Abstract

We design mechanisms for online procurement of data held by strategic agents for 
machine learning tasks. We study a model in which agents cannot fabricate data, but 
may lie about their cost of furnishing their data. The challenge is to use past data to 
actively price future data in order to obtain learning guarantees, even when agents’ 
costs can depend arbitrarily on the data itself. We show how to convert a large class 
of no-regret algorithms into online posted-price and learning mechanisms. Our results 
parallel classic sample complexity guarantees, but with the key resource constraint 
being money rather than quantity of data available. With a budget constraint B, we
give robust risk (predictive error) bounds on the order of $1/\sqrt{B}$. In many cases our
guarantees are significantly better due to an active-learning approach that leverages
correlations between costs and data.

Our algorithms and analysis go through a model of no-regret learning with T arriving 
pairs (cost, data) and a budget constraint of B, coupled with the “online to batch 
conversion”. Our regret bounds for this model are on the order of $T/\sqrt{B}$ and we give
lower bounds on the same order. 


1 Introduction 

The rising interest in the field of Machine Learning (ML) has been strongly driven by
the potential to generate economic value. Firms seeking revenue optimizations can gather
abundant data at low cost, apply a set of inexpensive algorithmic tools, and produce
high-accuracy predictors that can massively improve future decision making. The extent of the
potential value that can be created by leveraging data for prediction is apparent in the
multi-million dollar competition bounties offered by companies like Netflix and the Heritage
Health Foundation, but perhaps even more so in the aggressive hiring of many ML experts
by companies like Google and Facebook.

Many of the theoretical results in ML aim to measure, at least implicitly, the economic
efficiency of learning problems. For example, in certain settings we have a reasonably thorough
understanding of sample complexity [I] which gives us the precise tradeoff between n, the
quantity of data at our disposal, and the error or loss rate we want to achieve. Reducing error
is always beneficial, of course, but must be weighed against the marginal cost of increasing n.

The measures of efficiency in ML have broadened in recent years, in particular because 
gathering data is typically orders of magnitude cheaper than labeling it. This has led to 
the emergence of the active learning paradigm [MEiin]. Here, we imagine an interface 
between the learner and the label provider, where the learner may make label queries on data 
points in an online fashion. By sequentially choosing which data to label, the learner can 
greatly reduce the number of labels required to learn [T^ . 

A problem that has received little attention in the learning theory literature is the 
monetary efficiency of learning when data have differing costs. Indeed, real-world prediction 
tasks often require obtaining examples held by self-interested, strategic agents; these agents 
must be incentivized to provide the data they hold, and they have heterogeneous costs for 
doing so. 

In this vein, the present paper seeks to address the following question:

In a world where data is held by self-interested agents with heterogeneous costs 
for providing it, and in particular when these costs may be arbitrarily correlated 
with the underlying data, how can we design mechanisms that are incentive- 
compatible, have robust learning guarantees, and optimize the cost-efficiency 
tradeoffs inherent in the learning problem? 

This question is relevant to many real-world scenarios involving financial and strategic
considerations in data procurement. Here are two examples:

1. In the development of a certain drug, a pharmaceutical company wishes to train a disease
classifier based on data obtained by hospitals and stored in patients’ medical records.
These data are not public, yet the company can offer hospital patients financial incentives
to contribute their private records. We note the potential for cost heterogeneity: the
compensation required by patients may be correlated with the content of their medical
data (e.g., if they have the disease).

2. Online retailers generally hope to know more about website visitors in order to better
target products to customers. A retailer can offer to buy customers’ demographic and
social data, say in the form of access to their Facebook profile. But again, customers’
willingness to sell may covary with their demographic data in an unknown way.

From sample complexity to budget efficiency

The classical problem in statistical learning theory is the following. We are given n datapoints
(examples) $z_1, \ldots, z_n \in \mathcal{Z}$ sampled from some distribution $\mathcal{D}$. Our goal is to select a hypothesis
$h \in \mathcal{H}$ which “performs well” on unseen data from $\mathcal{D}$. We can specify performance in terms
of a loss function $\ell(h, z)$, and we write $\mathcal{L}(h)$, known as the risk of h, to be the expectation of
$\ell(h, z)$ on a random draw z from $\mathcal{D}$. The goal is to produce a hypothesis h whose risk is not
much more than that of $h^*$, the optimal member of $\mathcal{H}$. For example, in binary classification,
each data point consists of a pair $z = (x, y)$ where x encodes some “features” and $y \in \{-1, 1\}$
is the label; a hypothesis h is a function that predicts a label for a given set of features; and
a typical loss function, the “0-1 loss”, is defined so that $\ell(h, (x, y)) = 0$ when $h(x) = y$ and
$\ell(h, (x, y)) = 1$ otherwise.
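
As a minimal illustration of these definitions, the 0-1 loss and a sample-average estimate of the risk can be sketched as follows. The threshold classifier and the four-point sample are hypothetical and chosen only for the example; they do not come from the paper.

```python
from typing import Callable, List, Tuple

Example = Tuple[float, int]  # (feature x, label y in {-1, +1})

def zero_one_loss(h: Callable[[float], int], z: Example) -> int:
    """0-1 loss: 0 if h predicts the label correctly, 1 otherwise."""
    x, y = z
    return 0 if h(x) == y else 1

def empirical_risk(h: Callable[[float], int], sample: List[Example]) -> float:
    """Average loss on a sample; an estimate of the risk E_z[loss(h, z)]."""
    return sum(zero_one_loss(h, z) for z in sample) / len(sample)

# Hypothetical hypothesis: predict +1 when the feature is non-negative.
def threshold_at_zero(x: float) -> int:
    return 1 if x >= 0 else -1

# Hypothetical sample; only the second point is misclassified.
sample = [(-2.0, -1), (-0.5, 1), (0.3, 1), (1.7, 1)]
risk_estimate = empirical_risk(threshold_at_zero, sample)
```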

Research in statistical learning theory attempts to characterize how well such tasks can be
performed in terms of the resources available and the inherent difficulty of the problem. The
resource is usually the quantity of data n. In binary classification, for instance, the difficulty
or richness of the problem is captured by the “VC-dimension” d, and a famous result jlH] is
that there is an algorithm achieving the bound

$$\mathcal{L}(h) \le \mathcal{L}(h^*) + O\left(\sqrt{d/n}\right) \qquad (1)$$

with very high probability over the sample $z_1, \ldots, z_n$.

In the present work we consider an alternative scenario: the learner has a fixed budget
B and can use this budget to purchase examples. More precisely, on round t of a sequence
of T rounds, agent t arrives with data point $z_t$, sampled i.i.d. from some $\mathcal{D}$, and a cost
$c_t \in [0,1]$. This cost $c_t$ is known only to the agent and can depend arbitrarily on $z_t$. The
learning mechanism may offer a (possibly randomized) menu of take-it-or-leave-it prices
$\pi_t : \mathcal{Z} \to \mathbb{R}_+$, with a possibly different price $\pi_t(z)$ for each data point z. The arriving agent
observes the price $\pi_t(z_t)$ offered for her data and accepts as long as $\pi_t(z_t) \ge c_t$, in which case
the mechanism pays the agent $\pi_t(z_t)$ and learns $(c_t, z_t)$. Our goal is to actively select prices
to offer for different datapoints, subject to a budget B, in order to minimize the risk of our
final output h.

At a high level, our main result parallels the classical statistical learning guarantee in Equation (1),
but where the limited resource is the budget B instead of the sample size n.

Main Result 1 (Informal). For a large class of problems, there is an active data purchasing
algorithm $\mathcal{A}$ that spends at most B in expectation and outputs a hypothesis h satisfying

$$\mathbb{E}\, \mathcal{L}(h) \le \mathcal{L}(h^*) + O\left(\sqrt{\frac{\gamma_{T,\mathcal{A}}}{B}}\right),$$

where $\gamma_{T,\mathcal{A}} \in [0,1]$ is an algorithm-dependent parameter of the (cost, data) sequence capturing
the monetary difficulty of learning and the expectation is over the algorithm’s internal
randomness.

This bound depends on the quantity $\gamma_{T,\mathcal{A}}$, which captures the monetary difficulty of the
problem at hand. (We also need as prior knowledge a rough estimate of $\gamma_{T,\mathcal{A}}$.) This is in
rough analogy with VC-dimension in classical bounds such as Equation (1). Similarly, the key
resource constraint is now the budget B rather than the quantity of data n.

^We will discuss the interaction model further in Sections and

Figure 1: Algorithmic and analytic approach. First, we convert Follow-the-Regularized-Leader
online no-regret algorithms into mechanisms that purchase data for a regret-minimization
setting that we introduce for purposes of analysis. Then, we convert these into mechanisms to solve
our main problem, statistical learning. The mechanisms interact with the online learning algorithms
as black boxes, but the analysis relies on “opening the box”.

It is important to note that $\gamma_{T,\mathcal{A}}$ depends on the choice of algorithm $\mathcal{A}$. However, our
results also include simpler, algorithm-independent bounds. For instance, replace $\gamma_{T,\mathcal{A}}$ by
$\sqrt{\mu}$, where $\mu$ is the mean of the arriving costs, and Main Result 1 continues to hold (and the only
prior knowledge required is a rough estimate of $\mu$). But $\gamma_{T,\mathcal{A}}$ can be significantly smaller than
$\sqrt{\mu}$ when there are particular correlations between the costs and the examples; indeed, we
can have $\gamma_{T,\mathcal{A}} \to 0$ even as $\mu$ stays constant. This indicates a case in which the average cost
of data is high, but due to beneficial correlations between costs and data, our mechanism can
obtain all the data it needs for good learning very cheaply. We give a thorough discussion of
$\gamma_{T,\mathcal{A}}$ in Section 4.4.


Overview of Techniques 

Our general idea for attacking this problem is to utilize online learning algorithms (OLAs)
for regret minimization [6]. These algorithms output a hypothesis or prediction at each step
t = 1,..., T, and their performance is measured by the summed loss of these predictions over
all the steps. The idea is that the hypotheses produced by the OLA at each step can be used
both to determine the value of data during the procurement process and to generate a final
prediction.

In Section 3 we lay out the tools we need for a pricing and learning mechanism to interact
with OLAs. The first high-level problem is that, because of the budget constraint, our OLA
will only see a small subset of the data sequence. We use the tool of importance-weighting to
give good regret-minimization guarantees even when we do not see the entire data sequence.
The second problem is how to aggregate the hypotheses of the OLA and convert its regret
guarantee into a risk guarantee for our statistical learning setting. This is achieved with the
standard “online-to-batch” conversion [7].

Given the tools of Section 3, the key remaining challenge is to develop a pricing and
learning strategy that achieves low regret. We address this question in Section 4. We formally
define a model of online learning for regret minimization with purchased data, in which the
mechanism must output a hypothesis at each time step and perform well in hindsight against
the entire data sequence, but only has enough budget to purchase and observe a fraction of
the arriving data. We defer until later our detailed analysis of this setting, derivation of a
pricing strategy, and lower bounds. At this point, we present our pricing strategy and regret
guarantees for this setting.

In Section we give our main results: risk guarantees for a learner with budget B and 
access to T arriving agents. These bounds follow directly by using the tools in Section 3 and
regret-minimization results in Section 4.

In Section we develop a deeper understanding of the regret minimization setting. We 
derive our pricing strategy from an in-depth analysis of a more analytically tractable variant 
of the problem, the “at-cost” setting, where the mechanism is only required to pay the cost 
of the arriving data point rather than the price posted. For this setting, we are able to 
derive the optimal pricing strategy for minimizing the regret bound of our class of learning 
algorithms subject to an expected budget constraint. 

We also complement our upper bounds by proving lower bounds for data-purchasing
regret minimization. These show that our mechanisms for the easier at-cost setting have
an order-optimal regret guarantee of $T\gamma_{T,\mathcal{A}}/\sqrt{B}$. There is a small gap to our mechanisms for
the main regret minimization setting, in which our guarantee is on the order of $(T/\sqrt{B})\sqrt{\gamma_{T,\mathcal{A}}}$
(recall that $\gamma_{T,\mathcal{A}} \in [0,1]$, so this is a weaker guarantee). The dependence $T/\sqrt{B}$ approaches
the classic $\sqrt{T}$ regret bound when B is large (approaching T). When B is small but still
superconstant, we observe the perhaps counterintuitive fact that we can achieve o(1) average
regret per arrival while only observing an o(1) fraction of the arriving data; in other words,
we have “no data, no regret.”
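
To make the “no data, no regret” regime concrete, the following sketch evaluates the $T/\sqrt{B}$ rate for hypothetical values of T and B, ignoring constants. The assumption that every arrival costs exactly 1 (so that at most B arrivals can be bought in expectation) is ours and is made only to keep the arithmetic simple.

```python
import math

def average_regret_rate(T: int, B: float) -> float:
    """Per-arrival regret implied by total regret on the order of T / sqrt(B)."""
    return (T / math.sqrt(B)) / T   # simplifies to 1 / sqrt(B)

def max_fraction_observed(T: int, B: float) -> float:
    """Upper bound on the observed fraction, assuming every cost equals 1."""
    return min(1.0, B / T)

# Hypothetical values: a million arrivals and a budget of ten thousand.
T, B = 1_000_000, 10_000.0
rate = average_regret_rate(T, B)          # per-arrival regret rate
fraction = max_fraction_observed(T, B)    # share of arrivals we can afford
```

With these values both the per-arrival regret rate and the observable fraction are about 1%, and both shrink further whenever B grows more slowly than T.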

Related Work 

For “batch” settings in which all agents are offered a price simultaneously, pricing schemes for 
obtaining data have appeared in recent work, especially Roth and Schoenebeck [16], which 
considered the design of mechanisms for efficient estimation of a statistic. However, this work 
and others in related settings puniE] consider offline solutions, e.g. drawing a posted price 
independently for all data points. We focus on an active approach in which the marginal 
value of individual examples is estimated according to the current learning progress and 
budget. A data-dependent approach to pricing data does appear in Horel et al. [I3], but that 
paper focuses on a quite different learning setting, a model of regression with noisy samples 
with a budget-feasible mechanism design approach. 

Another difference from the above papers is that we prove risk and regret bounds rather 
than trying to minimize e.g. a variance bound, and we also consider a broader class of 
learning problems. 

Other related work. Other works such as Dekel et al. [U], Ghosh et al. m, Meir et al.
na focus on a setting in which agents may misreport their data (also see the peer-prediction 
literature). We suppose that agents may misreport their costs but not their data. 

Many of the ideas in the present work draw from recent advances in using importance 
weighting for the active learning problem [1]. There is a wealth of theoretical research into
active learning, including Balcan et al. [2], Beygelzimer et al. [^, Hanneke [T2] and many 
others. 

“Budgeted Learning” is a somewhat related area of machine learning, but there the budget 
is not monetary. The idea is that we do not see all of the features of the data points in our 
set, but rather have a “budget” of the number of features we may observe (for instance, we 
may choose any two of the three features height, weight, age). 


2 Statistical Learning with Purchased Data 

In this section, we formally define the problem setting. The body of the paper will then 
consist of a series of steps for deriving mechanisms for this setting with provable guarantees, 
which will finally appear in Section

We consider a statistical learning problem described as follows. Our data points are
objects $z \in \mathcal{Z}$. We are given a hypothesis class $\mathcal{H}$ which we will assume is parameterized
by vectors in $\mathbb{R}^d$ but more broadly can be any Hilbert space endowed with a norm $\|\cdot\|$; for
convenience we will treat elements $h \in \mathcal{H}$ as vectors which can be added, scaled, etc. We are
also given a loss function $\ell : \mathcal{H} \times \mathcal{Z} \to \mathbb{R}$ that is convex in $\mathcal{H}$. We assume throughout the
paper that the loss function is 1-Lipschitz in h; that is, for any $z \in \mathcal{Z}$ and any $h, h' \in \mathcal{H}$ we
have $|\ell(h, z) - \ell(h', z)| \le \|h - h'\|$.

In many common scenarios, $\mathcal{Z}$ is the space of pairs $(x, y)$ from the cross product $\mathcal{X} \times \mathcal{Y}$,
with x the feature input and y the label, though in our setting $\mathcal{Z}$ can be a more generic object.
For example, in the canonical problem of linear regression, we have that $\mathcal{Z} = \mathcal{X} \times \mathcal{Y} = \mathbb{R}^d \times \mathbb{R}$,
the hypothesis class is vectors $\mathcal{H} = \mathbb{R}^d$, and the loss function is defined according to squared
error $\ell(h, (x, y)) := (h^\top x - y)^2$.

The data-purchasing statistical learning problem is parameterized by the data
space $\mathcal{Z}$, hypothesis space $\mathcal{H}$, loss function $\ell$, number of arriving data points T, and expected
budget constraint B. A problem instance consists of a distribution $\mathcal{D}$ on the set $\mathcal{Z}$ and a
sequence of pairs $(c_1, z_1), \ldots, (c_T, z_T)$ where each $z_t$ is a data point drawn i.i.d. according to
$\mathcal{D}$ and each $c_t \in [0,1]$ is the private cost associated with that data point. The costs may be
arbitrarily chosen, i.e. we consider a worst-case model of costs. (For instance, if costs and
data are drawn together from a joint, correlated distribution, then this is a special case of
our setting.)

In this problem, the task is to design a mechanism implementing the operations “post”, 
“receive”, and “predict” and interacting with the problem instance as follows. 

• For each time step t = 1,..., T:

1. The mechanism posts a pricing function $\pi_t : \mathcal{Z} \to \mathbb{R}$, where $\pi_t(z)$ is the price
posted for data point z.

2. Agent t arrives, possessing $(c_t, z_t)$.

3. If the posted price $\pi_t(z_t) \ge c_t$, then agent t accepts the transaction: The mechanism
pays $\pi_t(z_t)$ to the agent and receives $(c_t, z_t)$. If $\pi_t(z_t) < c_t$, agent t rejects the
transaction and the mechanism receives a null signal.

• The mechanism outputs a prediction $h \in \mathcal{H}$.
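
The interaction above can be sketched as a simple loop. The function name `run_posted_price_protocol`, the callback signatures, and the flat-price mechanism in the usage lines are hypothetical and serve only to illustrate the post/accept/reject flow; in particular, the flat price makes no attempt to respect a budget.

```python
from typing import Callable, List, Optional, Tuple

Agent = Tuple[float, float]                  # (private cost c_t, data point z_t)
PricingFunction = Callable[[float], float]   # pi_t: data point -> posted price

def run_posted_price_protocol(
    agents: List[Agent],
    post: Callable[[int], PricingFunction],
    receive: Callable[[Optional[Agent]], None],
) -> float:
    """Run the post / arrive / accept-or-reject loop and return total spend."""
    spent = 0.0
    for t, (c_t, z_t) in enumerate(agents):
        pi_t = post(t)               # mechanism posts a pricing function
        price = pi_t(z_t)
        if price >= c_t:             # agent accepts the transaction
            spent += price           # mechanism pays the posted price
            receive((c_t, z_t))
        else:                        # agent rejects: null signal
            receive(None)
    return spent

# Hypothetical mechanism: offer 0.5 for every data point and store purchases.
bought: List[Agent] = []

def flat_price_post(t: int) -> PricingFunction:
    return lambda z: 0.5

def collect(signal: Optional[Agent]) -> None:
    if signal is not None:
        bought.append(signal)

agents = [(0.2, 1.0), (0.9, 5.0), (0.5, 3.0)]
spent = run_posted_price_protocol(agents, flat_price_post, collect)
```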

Note that the mechanism is given the parameters $\mathcal{Z}$, $\mathcal{H}$, $\ell$, T, and B, but the problem instance
is completely unknown to the mechanism prior to the arrivals. The design problem of
the mechanism is how to choose the pricing function $\pi_t$ to post at each time, how to update
based on receiving data, and how to choose the final prediction. The risk or predictive error
of a hypothesis is

$$\mathcal{L}(h) = \mathbb{E}_{z \sim \mathcal{D}}\, \ell(h, z)$$

and the goal of the mechanism is to minimize the risk $\mathcal{L}(h)$ of its final hypothesis h. The
benchmark is the optimal hypothesis in the class, $h^* = \arg\min_{h \in \mathcal{H}} \mathcal{L}(h)$.

The mechanism must guarantee that, for every input sequence $(c_1, z_1), \ldots, (c_T, z_T)$, it
spends at most B in expectation over its own internal randomness.

Agent-mechanism interaction. The model of agent arrival and posted prices contains 
several assumptions. First, agents cannot fabricate data; they can only report data they 
actually have to the mechanism. Second, agents are rational in that they accept a posted 
price when it is higher than their cost and reject otherwise. Third, we have an implementation 
of the mechanism that can obtain the agent’s cost $c_t$ when the transaction occurs.

We emphasize that this paper does not focus on the implementation of such
a setting, but instead on developing active learning and pricing techniques and guarantees.
This is also intended as a simple and clean model in which to begin developing such techniques. 
However, we briefly note some possible implementations. 

In the most straightforward one, the mechanism posts prices directly to the agent who 
responds directly. This would be a weakly truthful implementation, as agents have no 
incentive to misreport costs after they choose to accept the transaction. 

One strictly truthful implementation uses a trusted third party (TTP) that can facilitate 
the transactions (and guarantee the validity of the data if necessary). For example, we could 
imagine attempting to learn to classify a disease, and we could rely on a hospital to act as the 
broker allowing us to negotiate with patients for their data. Then the TTP/agent interaction 
could proceed as follows: 

1. Learning mechanism submits the pricing function $\pi_t$ to the TTP;

2. Agent provides his data point $z_t$ and cost $c_t$ to the TTP;

3. TTP determines whether $\pi_t(z_t) \ge c_t$ and, if so, instructs the learner to pay $\pi_t(z_t)$ to
the agent and then provides the pair $(z_t, c_t)$ to the learner.

Other possibilities for strictly truthful implementation include using a bit of cryptography 
(see Section]^. 

3 Tools for Converting Regret-Minimizing Algorithms 

In this section we begin with the classic regret-minimization problem and a broad class of 
algorithms for this problem. We then show how to apply techniques that convert these 
algorithms into a form that will be useful for solving the statistical learning problem with 
purchased data. The only missing ingredient will then be a price-posting strategy, which will 
be presented in Section 4.

3.1 Recap of Classic Regret-Minimization 

In the classic regret-minimization problem, we have a hypothesis class $\mathcal{H}$ with the same
assumptions as stated in Section 2. At each time t = 1,..., T the algorithm posts a hypothesis
$h_t \in \mathcal{H}$. Nature (the adversary, the environment, etc.) selects a 1-Lipschitz convex loss
function $f_t : \mathcal{H} \to \mathbb{R}$. The algorithm observes $f_t$ and suffers loss $f_t(h_t)$.

The loss and regret of the algorithm on this particular input sequence are

$$\mathrm{Loss}_T = \sum_{t=1}^{T} f_t(h_t), \qquad (2)$$

$$\mathrm{Regret}_T = \mathrm{Loss}_T - \min_{h^* \in \mathcal{H}} \sum_{t=1}^{T} f_t(h^*). \qquad (3)$$

The regret objective is what one typically studies in adversarial settings, where we want to
discount the loss incurred by the algorithm by the loss suffered by the best possible $h^*$ chosen
with knowledge of the sequence of $f_t$s. As we often consider randomized algorithms, we will
generally consider expected loss and regret, where the expectation is over any randomness
in the algorithm, not over the (possibly-randomized) input sequence of loss functions. An
algorithm is said to guarantee regret R(T) if the latter provides an upper bound on regret
for every sequence of loss functions $f_1, \ldots, f_T$.
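
As a small worked instance of Equations (2) and (3), the sketch below computes loss and regret for one-dimensional absolute losses $f_t(h) = |h - z_t|$, which are convex and 1-Lipschitz. The choice of loss, the targets, and the hypothesis sequence are hypothetical and chosen only for illustration.

```python
from typing import List

def total_loss(hypotheses: List[float], targets: List[float]) -> float:
    """Loss_T = sum_t f_t(h_t) with f_t(h) = |h - z_t|."""
    return sum(abs(h - z) for h, z in zip(hypotheses, targets))

def regret(hypotheses: List[float], targets: List[float]) -> float:
    """Regret_T = Loss_T - min_h sum_t f_t(h).

    For absolute losses the minimum is attained at a median of the targets,
    so it suffices to search over the targets themselves.
    """
    best_fixed = min(sum(abs(h - z) for z in targets) for h in targets)
    return total_loss(hypotheses, targets) - best_fixed

# Hypothetical run: the learner drifts toward the value 1 over three rounds.
targets = [0.0, 1.0, 1.0]
hypotheses = [0.0, 0.0, 0.5]
example_loss = total_loss(hypotheses, targets)
example_regret = regret(hypotheses, targets)
```

Here the best fixed hypothesis in hindsight is h = 1 with total loss 1, so the learner's total loss of 1.5 corresponds to a regret of 0.5.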

We utilize the broad class of Follow-the-Regularized-Leader (FTRL) online algorithms
(Algorithm 1) (TSU^O]. Special cases of FTRL include Online Gradient Descent, Multiplicative
Weights, and others. Each FTRL algorithm is specified by a convex function $G : \mathcal{H} \to \mathbb{R}$
which is known as a regularizer and is usually strongly convex with respect to some norm.
For example, Multiplicative Weights follows by using the negative entropy function as a
regularizer, which is strongly convex with respect to the $\ell_1$ norm [B]. Online Gradient Descent
follows by using the regularizer $G(h) = \frac{1}{2}\|h\|_2^2$, which is strongly convex with respect to the $\ell_2$
norm. These special cases have efficient closed-form solutions to the update rule for computing
$h_{t+1}$.

^This definition of “loss function” is a departure from our main setting, which involved $\ell(\cdot, \cdot)$. But we will
use this somewhat more general setup by choosing $f_t(h) \propto \ell(h, z_t)$ for the datapoint $z_t$.


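ALGORITHM 1: Follow-the-Regularized-Leader (FTRL).
Input: learning parameter $\eta$, convex regularizer $G : \mathcal{H} \to \mathbb{R}$

for t = 1,..., T do

    post hypothesis $h_t$, observe loss function $f_t$;
    update $h_{t+1} = \arg\min_{h \in \mathcal{H}} \left\{ \sum_{s \le t} f_s(h) + \frac{1}{\eta} G(h) \right\}$;

end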


It is well-known (and indeed follows as a special case of Lemma 3.1) that, under the
assumptions on our setting, FTRL algorithms guarantee an expected regret bound of $O(\sqrt{T})$,
and this is tight with respect to T.
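
For intuition about the closed-form updates mentioned above, here is a minimal sketch of FTRL with the quadratic regularizer $G(h) = \frac{1}{2}\|h\|_2^2$. It makes two simplifying assumptions that are ours, not the paper's: each loss is replaced by its linearization (only a gradient at the posted hypothesis is used), and the hypothesis space is all of $\mathbb{R}^d$. Under these assumptions the FTRL minimizer is $h_{t+1} = -\eta \sum_{s \le t} g_s$, which coincides with lazy online gradient descent. The class name `QuadraticFTRL` is hypothetical.

```python
from typing import List

Vector = List[float]

class QuadraticFTRL:
    """FTRL with G(h) = 0.5 * ||h||_2^2 on linearized losses (a sketch).

    Minimizing sum_s <g_s, h> + G(h) / eta over R^d gives h = -eta * sum_s g_s.
    """

    def __init__(self, dim: int, eta: float) -> None:
        self.eta = eta
        self.grad_sum = [0.0] * dim

    def hypothesis(self) -> Vector:
        """Current hypothesis h_t."""
        return [-self.eta * g for g in self.grad_sum]

    def update(self, gradient: Vector) -> None:
        """Feed a (sub)gradient of f_t evaluated at the posted hypothesis."""
        self.grad_sum = [s + g for s, g in zip(self.grad_sum, gradient)]

# Hypothetical usage with f_t(h) = |h - 1|, whose subgradient is -1 for h < 1.
learner = QuadraticFTRL(dim=1, eta=0.5)
first = learner.hypothesis()      # starts at the origin
learner.update([-1.0])            # gradient at h = 0
second = learner.hypothesis()
learner.update([-1.0])            # gradient at h = 0.5
third = learner.hypothesis()
```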


3.2 Importance-Weighting Technique for Less Data 


As a starting point, suppose we wish to design an online learning algorithm that does not 
observe all of the arriving loss functions, but still performs well against the entire arrival 
sequence. 

Because the arrival sequence may be adversarially chosen, a good algorithm should
randomly choose to sample some of the arrivals. In this section, we abstract away the decision
of how to randomly sample. (This will be the focus of Section 4.) In this section, we suppose
that at each time t, after posting a hypothesis $h_t$, a probability $q_t > 0$ is specified by some
external means as a (possibly random) function of the preceding time steps. With probability
$q_t$, we observe $f_t$; with probability $1 - q_t$, we observe nil.

Our goal is to modify the FTRL algorithm for this setting and obtain a modified regret
guarantee. Notice crucially that the definitions of loss and regret (Equations (2) and (3)) are unchanged: We still
suffer the loss $f_t(h_t)$ regardless of whether we observe $f_t$.

The key technique we use is importance weighting. The idea is that, if we only observe
each of a sequence of values $X_t$ with probability $p_t$, then we can get an unbiased estimate of
their sum by taking the sum of $X_t / p_t$ for those we do observe. To check this fact, let $I_t$ be the
indicator variable for the event that we observe $X_t$ and note that the expectation of our sum is

$$\mathbb{E}\left[\sum_t I_t \frac{X_t}{p_t}\right] = \sum_t p_t \frac{X_t}{p_t} = \sum_t X_t.$$

This is called importance-weighting the observations (and is a specific
instance of a more general machine learning technique). Furthermore, if each $X_t / p_t$ is bounded
and observed independently, we can expect the estimate to be quite good via tail bounds.

The importance-weighted modification to an online learning algorithm is outlined in
Algorithm 2. The importance-weighted regret guarantee we obtain is given in Lemma 3.1.
It depends on the following key notation. Our analysis and algorithm require a given norm
$\|\cdot\|$, and we recall the definition of the dual norm $\|z\|_* := \sup_{x : \|x\| \le 1} x \cdot z$.
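
The unbiasedness of the importance-weighted sum can be checked exactly on a small instance by enumerating every observation pattern. The values and probabilities below are hypothetical, and the enumeration assumes each value is observed independently.

```python
from itertools import product
from typing import Sequence

def importance_weighted_sum(
    values: Sequence[float], probs: Sequence[float], observed: Sequence[bool]
) -> float:
    """Sum of X_t / p_t over the values that were actually observed."""
    return sum(x / p for x, p, seen in zip(values, probs, observed) if seen)

def exact_expectation(values: Sequence[float], probs: Sequence[float]) -> float:
    """Expectation of the estimate, enumerating all observation patterns."""
    total = 0.0
    for pattern in product([False, True], repeat=len(values)):
        weight = 1.0
        for seen, p in zip(pattern, probs):
            weight *= p if seen else (1.0 - p)   # independent observations
        total += weight * importance_weighted_sum(values, probs, pattern)
    return total

# Hypothetical values and observation probabilities.
values = [2.0, -1.0, 4.0]
probs = [0.5, 0.25, 1.0]
expected_estimate = exact_expectation(values, probs)
```

The expectation matches the true sum of the values up to floating-point error, even though a single realization may see only some of them.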

Definition 3.1. Given $h \in \mathcal{H}$ and convex loss $f : \mathcal{H} \to \mathbb{R}$, let $\Delta_{h,f} := \|\nabla f(h)\|_*$.

We can informally think of $\Delta_{h,f}$ both as the “difficulty” of arrival f when the current
hypothesis is h, and as the “value” of observing f. This interpretation is explored in Section 4
when we define the parameter $\gamma_{T,\mathcal{A}}$.


ALGORITHM 2: Importance-Weighted Online Learning Algorithm.
Input: access to Online Learning Algorithm (OLA)

for t = 1,..., T do

    post hypothesis $h_t$ ← OLA; observe sampling probability $q_t$;
    toss $q_t$-weighted coin (Bernoulli sample) $e_t$;
    if $e_t = 1$: input importance-weighted loss function $\hat{f}_t(\cdot) = f_t(\cdot) / q_t$ → OLA;
    if $e_t = 0$: input zero function $\hat{f}_t(\cdot) = 0$ → OLA;

end
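
One round of the importance-weighted wrapper can be sketched as below, assuming that the importance-weighted loss is the observed loss divided by its sampling probability (the standard form, consistent with the importance-weighting discussion above). The function name `importance_weighted_round` and the scalar hypothesis type are hypothetical simplifications.

```python
import random
from typing import Callable

Loss = Callable[[float], float]

def importance_weighted_round(loss: Loss, q: float, rng: random.Random) -> Loss:
    """Return the loss function that is fed to the OLA in one round."""
    if not 0.0 < q <= 1.0:
        raise ValueError("sampling probability must lie in (0, 1]")
    if rng.random() < q:                  # coin lands 1 with probability q
        return lambda h: loss(h) / q      # importance-weighted loss
    return lambda h: 0.0                  # zero function

# Hypothetical usage with the loss f(h) = |h - 1| evaluated at h = 0.
rng = random.Random(0)
always_seen = importance_weighted_round(lambda h: abs(h - 1.0), 1.0, rng)
sometimes_seen = importance_weighted_round(lambda h: abs(h - 1.0), 0.5, rng)
```

With q = 1 the wrapper passes the loss through unchanged; with q = 0.5 the OLA receives either the zero function or the loss scaled by 2.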


Lemma 3.1. Assume we implement Algorithm 2 with nonzero sampling probabilities $q_1, \ldots, q_T$.
Assume the underlying OLA is FTRL (Algorithm 1) with regularizer $G : \mathcal{H} \to \mathbb{R}$ that is
strongly convex with respect to $\|\cdot\|$. Then the expected regret, with respect to the loss sequence
$f_1, \ldots, f_T$, is no more than

$$R(T) = \frac{\beta}{\eta} + 2\eta\, \mathbb{E}\left[\sum_{t=1}^{T} \frac{\Delta_{h_t, f_t}^2}{q_t}\right],$$

where $\beta$ is a constant depending on $\mathcal{H}$ and G, $\eta$ is a parameter of the algorithm, and the
expectation is over any randomness in the choices of $h_t$ and $q_t$.


We can recover the classic regret bound as follows: Take each $q_t = 1$, and note by the
Lipschitz assumption that each $\Delta_{h_t, f_t} \le 1$. Then by setting $\eta = \Theta(1/\sqrt{T})$, we get an expected
regret bounded by $O(\sqrt{T})$.


3.3 The “Online-to-Batch” Conversion 

So far so good: We can convert an online regret-minimization algorithm to use smaller 
amounts of data, and we postpone the question of how to price data till Section 4. We now
address the statistical learning problem, which is how to generate accurate predictions based 
on the online learning process. 

We address this with a standard tool known as the “online-to-batch conversion,” where 
we may leverage an online learning algorithm for use in a “batch” setting. A sketch of this 
technique is as follows, and further details can be found in, e.g., Shalev-Shwartz [TH]. Given 
a batch of i.i.d. data points, feed them one-by-one into the no-regret algorithm. Because the 
algorithm has low regret, its hypotheses predicted well on average. But since each data point 
was drawn i.i.d., this means that these hypotheses on average predict well on an i.i.d. draw 
from the distribution. Thus it suffices to take the mean of the hypotheses to obtain low risk. 



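Lemma 3.2 (Online-to-Batch [7]). Suppose the sequence of convex loss functions $f_1, \ldots, f_T$
are drawn i.i.d. from a distribution and that an online learning algorithm with hypotheses
$h_1, \ldots, h_T$ achieves expected regret R(T). Let $\mathcal{L}(h) = \mathbb{E}\, f(h)$, where the expectation is over a
draw of f from this distribution, and $h^* = \arg\min_{h \in \mathcal{H}} \mathcal{L}(h)$.
For $\bar{h}_{1:T} = \frac{1}{T} \sum_{t=1}^{T} h_t$, we have

$$\mathbb{E}_{\mathrm{alg}}\, \mathcal{L}(\bar{h}_{1:T}) \le \mathcal{L}(h^*) + \frac{1}{T} R(T).$$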

We note that this conversion will continue to hold in the data-purchasing no-regret setting
we define next, since all that is required is that the algorithm output a hypothesis $h_t$ at each
step and that there is a regret bound on these hypotheses.
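
The conversion itself is only an average of the posted hypotheses. The sketch below computes that average and evaluates the right-hand side of Lemma 3.2 for given inputs; the function names and the numbers in the usage lines are hypothetical.

```python
from typing import List, Sequence

def average_hypothesis(hypotheses: Sequence[Sequence[float]]) -> List[float]:
    """Coordinate-wise mean (1/T) * sum_t h_t of the online hypotheses."""
    T = len(hypotheses)
    if T == 0:
        raise ValueError("need at least one hypothesis")
    dim = len(hypotheses[0])
    return [sum(h[i] for h in hypotheses) / T for i in range(dim)]

def risk_upper_bound(best_risk: float, regret_bound: float, T: int) -> float:
    """Right-hand side of the online-to-batch bound: L(h*) + R(T) / T."""
    return best_risk + regret_bound / T

# Hypothetical values: two 2-dimensional hypotheses, and a regret bound of 20
# over T = 100 rounds against a best-in-class risk of 0.1.
h_bar = average_hypothesis([[0.0, 2.0], [1.0, 4.0]])
bound = risk_upper_bound(best_risk=0.1, regret_bound=20.0, T=100)
```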


4 Regret Minimization with Purchased Data 


In this section, we define the problem of regret minimization with purchased data. We will
design mechanisms with good regret guarantees for this problem, which will translate via
the aforementioned online-to-batch conversion (Lemma 3.2) into guarantees for our original
problem of statistical prediction.

The essence of the data-purchasing no-regret learning setting is that an online algorithm 
(“mechanism”) is asked to perform well against a sequence of data, but by default, the 
mechanism does not have the ability to see the data. Rather, the mechanism may purchase 
the right to observe data points using a limited budget. The mechanism is still expected to 
have low regret compared to the optimal hypothesis in hindsight on the entire data sequence 
(even though it only observes a portion of the sequence). 


4.1 Problem Definition 

The data-purchasing regret minimization problem is parameterized by the hypothesis
space $\mathcal{H}$, number of arriving data points T, and expected budget constraint B. A problem
instance is a sequence of pairs $(c_1, f_1), \ldots, (c_T, f_T)$ where each $f_t : \mathcal{H} \to \mathbb{R}$ is a convex loss
function and each $c_t \in [0,1]$ is the cost associated with that data point. We assume that the
$f_t$ are 1-Lipschitz, and let $\mathcal{F}$ be the set of such loss functions.

In this problem, we design a mechanism implementing the operations “post” and “receive”
and interacting with the problem instance as follows.

• For each time step t = 1,..., T:

1. The mechanism posts a hypothesis $h_t$ and a pricing function $\pi_t : \mathcal{F} \to \mathbb{R}$, where
$\pi_t(f)$ is the price posted for loss function f.

2. Agent t arrives, possessing $(c_t, f_t)$.

3. If the posted price $\pi_t(f_t) \ge c_t$, then agent t accepts the transaction: The mechanism
pays $\pi_t(f_t)$ to the agent and receives $(c_t, f_t)$. If $\pi_t(f_t) < c_t$, agent t rejects the
transaction and the mechanism receives a null signal.

Note the key differences from the statistical learning setting: We must post a hypothesis $h_t$
at each time step (and we do not output a final prediction), and data is not assumed to come
from a distribution.

The goal of the mechanism is to minimize the loss, namely $\sum_{t=1}^{T} f_t(h_t)$. The definition of
regret is also the same as in the classical setting (Equation 3). Note that we suffer a loss $f_t(h_t)$
at time t regardless of whether we purchase $f_t$ or not. The mechanism must also guarantee
that, for every problem instance $(c_1, f_1), \ldots, (c_T, f_T)$, it spends at most B in expectation over
its own internal randomness.


4.2 The Importance-Weighting Framework 


Recall that, in Section 3.2, we introduced the importance-weighting technique for online
learning. This gave regret guarantees for a learning algorithm when each arrival $f_t$ is observed
with some probability $q_t$.

Our general approach will be to develop a strategy for randomly drawing posted prices $\pi_t$.
This will induce a probability $q_t$ of obtaining each arrival $f_t$.

Therefore, the entire problem has been reduced to choosing a posted-price strategy at 
each time step. This posted-price strategy should attempt to minimize the regret bound 
while satisfying the expected budget constraint. 

A brief sketch of the proof arguments is as follows. After we choose a posted price strategy,
each $q_t$ will be determined as a function of $h_t$, $c_t$, and $f_t$. ($q_t$ is just equal to the probability
that our randomly drawn price exceeds the agent’s cost $c_t$.) Thus, we can apply Lemma 3.1,
which stated that for these induced probabilities $q_t$, the expected regret of the learning
algorithm is

$$\frac{\beta}{\eta} + 2\eta\, \mathbb{E} \sum_{t} \frac{\Delta_{h_t, f_t}^2}{q_t},$$

where $\beta$ is a constant and $\eta$ is a parameter of the learning algorithm to be chosen later.
After we choose and apply such a strategy, the general approach to proving our regret
bounds is to find an a priori bound M such that $2\, \mathbb{E} \sum_t \Delta_{h_t, f_t}^2 / q_t \le M$. Then the regret bound
becomes $\frac{\beta}{\eta} + \eta M$. If we know this upper-bound M in advance using some prior knowledge,
then we can choose $\eta = \Theta(1/\sqrt{M})$ as the parameter for our learning algorithms. This gives a
regret guarantee of $O(\sqrt{M})$.
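
The tuning step can be checked numerically. The sketch below evaluates the bound $\beta/\eta + \eta M$ and its minimizer $\eta = \sqrt{\beta/M}$, at which the bound equals $2\sqrt{\beta M}$. The values of $\beta$ and M are hypothetical.

```python
import math

def regret_bound(beta: float, eta: float, M: float) -> float:
    """The bound beta / eta + eta * M for a given learning parameter eta."""
    return beta / eta + eta * M

def tuned_eta(beta: float, M: float) -> float:
    """Minimizer of beta / eta + eta * M over eta > 0."""
    return math.sqrt(beta / M)

# Hypothetical constants.
beta, M = 1.0, 400.0
eta = tuned_eta(beta, M)
bound = regret_bound(beta, eta, M)   # equals 2 * sqrt(beta * M) at the optimum
```

Doubling or halving the tuned value strictly increases the bound, which is why a rough a priori estimate of M suffices up to constant factors.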


4.3 A First Step to Pricing: The “At-Cost” Variant 

The bulk of our analysis of the no-regret data-purchasing problem actually focuses on a 
slightly easier variant of the setting: If the arriving agent accepts the transaction, then the 
mechanism only has to pay the cost $c_t$ rather than the posted price $\pi_t(f_t)$. We call this 
the “at-cost” variant of the problem. This setting turns out to be much more analytically 
tractable: We derive optimal regret bounds for our mechanisms and matching lower bounds. 
We then take the key approach and insights derived from this variant and apply them to 
produce a solution to the main no-regret data purchasing problem. In order to keep the story 
moving forward, we summarize our results for the “at-cost” setting here and explore how 
they are obtained in Section 6. 

In the at-cost setting, we are able to solve directly for the pricing strategy that minimizes 
the importance-weighted regret bound of Lemma 3.1. We first define one important quantity, 
then we state the strategy and result in Theorem 4.1. 


Definition 4.1. For a fixed input sequence $(c_1, f_1), \ldots, (c_T, f_T)$, with $\Delta_{h,f}$ as in 
Definition 3.1, and a mechanism outputting (possibly random) hypotheses $h_1, \ldots, h_T$, define 

$$\gamma_{T,A} = \mathbb{E}\left[\frac{1}{T}\sum_{t=1}^{T} \Delta_{h_t,f_t}\sqrt{c_t}\right],$$

where the expectation is over the randomness of the algorithm. Note that $\gamma_{T,A}$ lies in $[0,1]$ 
by our assumptions on bounded cost and Lipschitz loss. 

Now we give the main result for the at-cost setting. 

Theorem 4.1. There is a mechanism for the “at-cost” problem of data purchasing for regret 
minimization that interfaces with FTRL and guarantees to meet the expected budget constraint, 
where for a parameter $\gamma_{T,A} \in [0,1]$ (Definition 4.1), 

1. The expected regret is bounded by $O\left(\max\left\{\frac{T}{\sqrt{B}}\gamma_{T,A},\ \sqrt{T}\right\}\right)$. 

2. This is optimal in that no mechanism can improve beyond constant factors. 

3. The pricing strategy is to choose a parameter $K = O\left(\frac{T}{B}\gamma_{T,A}\right)$ and draw 
$\pi_t(f)$ randomly according to a distribution such that 
$\Pr[\pi_t(f) \ge c] = \min\left\{1, \frac{\Delta_{h_t,f}}{K\sqrt{c}}\right\}$. 

The only prior knowledge required is an estimate of $\gamma_{T,A}$ up to a constant factor. 


4.4 Interpreting the Quantity $\gamma_{T,A}$ 

Several of our bounds rely heavily on the quantity $\gamma_{T,A}$, which measures, in a sense, the 
“financial difficulty” of the problem. We now devote some discussion to understanding $\gamma_{T,A}$ 
by answering four questions. 

(1) How to interpret $\gamma_{T,A}$? 

$\gamma_{T,A}$ is an average, over time steps $t$, of $\Delta_{h_t,f_t}\sqrt{c_t}$. Here, $\Delta_{h_t,f_t}$ 
intuitively captures both the “difficulty” of the data $f_t$ and also the “value” or “benefit” of 
$f_t$. To explain the difficulty aspect, by examining the regret bound for FTRL learning 
algorithms (e.g. the importance-weighted regret bound of Lemma 3.1 with all $q_t = 1$), one 
observes that if each $\Delta_{h_t,f_t}$ is small, then we have an excellent regret bound for our 
learning algorithm; the problem is “easy”. To explain the value aspect, one can for concreteness 
take the Online Gradient Descent algorithm; the larger the gradient, the larger the update at 
this step, and $\Delta_{h_t,f_t}$ is the norm of the gradient. And in general, the higher 
$\Delta_{h_t,f_t}$, the more likely we are to purchase arrival $f_t$. 

Thus, $\gamma_{T,A}$ captures the correlations between the value of the arriving data and the cost 
of that data. If either the mean of the costs or the average benefit $\Delta_{h_t,f_t}$ of the data is 
converging to 0, then $\gamma_{T,A} \to 0$ and in these cases we can learn with high accuracy very 
cheaply, as may be expected. More interestingly, it is possible to have both high average 
costs and high average data-values, and yet still have $\gamma_{T,A} \to 0$ due to beneficial 
correlations. In these cases we can learn much more cheaply than might be expected based on 
either the economic side or the learning side alone. 
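To make the role of correlations concrete, here is a small sketch that evaluates the average of $\Delta_{h_t,f_t}\sqrt{c_t}$ on three toy sequences with the same mean cost and the same mean benefit. The sequences are invented for illustration, and the hypotheses are treated as fixed, so no expectation over the algorithm's randomness is taken.

```python
import math

def gamma(deltas, costs):
    """Average of delta_t * sqrt(c_t) for one fixed run (no expectation taken)."""
    return sum(d * math.sqrt(c) for d, c in zip(deltas, costs)) / len(deltas)

# All three sequences have mean cost 0.5 and mean benefit 0.5.
beneficial = gamma([1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0])  # valuable points are free
adverse = gamma([1.0, 0.0, 1.0, 0.0], [1.0, 0.0, 1.0, 0.0])     # valuable points are expensive
uniform = gamma([0.5] * 4, [0.5] * 4)                           # no variation at all
```

In the first sequence the quantity is zero even though neither the average cost nor the average benefit is small, which is the "beneficial correlation" case described above; in the second it is as large as these averages allow.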

(2) When should we expect to have good prior knowledge of $\gamma_{T,A}$? 

Although in general $\gamma_{T,A}$ will be domain-specific, there are several reasons for optimism. 
First, $\gamma_{T,A}$ compresses all information about the data and costs into a single scalar parameter 
(compare to the common mechanism-design assumption that the prior distribution of agents’ 
values is fully known). Second, we do not need very exact estimates of $\gamma_{T,A}$ (e.g. we do not 
need to know $\gamma_{T,A} \pm \epsilon$): For order-optimal regret bounds, we only need an estimate within a 
constant factor of $\gamma_{T,A}$. Third, $\gamma_{T,A}$ is directly proportional to $K$, which is a normalization 
constant in our pricing distribution: If we increase $K$, the probability of obtaining a given data 
point only decreases, and vice versa. In fact, the best choice of $K$ is the normalization constant 
so that we run out of budget precisely when the last arrival leaves. Thus, $K$ (equivalently, 
$\gamma_{T,A}$) can be estimated and adjusted online by tracking the “burn rate” (spending per unit 
time) of the algorithm. In simulations, we have observed success with a simple approach of 
estimating $K$ based on the average correlation so far along with the burn rate, i.e. if the 
current estimated $\gamma_{T,A}$ is $\hat{\gamma}_{T,A}$ and there are $T'$ steps remaining with $B'$ budget 
remaining to spend, set $K = \hat{\gamma}_{T,A} T'/B'$. 
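The burn-rate rule can be sketched as follows. This is only an illustration under stated assumptions: the paper does not specify how the running estimate of $\gamma_{T,A}$ is formed, so the importance-weighted estimator below (dividing each purchased term by its purchase probability) is one plausible choice, and both function names are hypothetical.

```python
import math

def normalizer_from_burn_rate(gamma_hat, steps_remaining, budget_remaining):
    """K = gamma_hat * T' / B', the rule quoted in the text.

    With no budget left, return infinity so that nothing more is purchased.
    """
    if budget_remaining <= 0.0:
        return math.inf
    return gamma_hat * steps_remaining / budget_remaining

def running_gamma_estimate(purchases, steps_elapsed):
    """Assumed estimator: average of delta*sqrt(c)/q over purchased rounds.

    purchases: list of (delta, cost, q) for rounds in which data was bought,
    where q is the probability with which that purchase happened.
    Dividing by q compensates for rounds that were not observed.
    """
    if steps_elapsed == 0:
        return 0.0
    return sum(d * math.sqrt(c) / q for d, c, q in purchases) / steps_elapsed
```

A mechanism would call `running_gamma_estimate` after each round and feed the result, together with the remaining horizon and budget, into `normalizer_from_burn_rate` before drawing the next price.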

(3) What can we prove without prior knowledge of $\gamma_{T,A}$? 

It turns out that, if we only have an estimate of $\bar{c} = \frac{1}{T}\sum_t \sqrt{c_t}$, respectively 
$\mu = \frac{1}{T}\sum_t c_t$, then this suffices for regret guarantees on the order of 
$T\bar{c}/\sqrt{B}$, respectively $T\sqrt{\mu}/\sqrt{B}$. This 
“graceful degradation” will continue to be true in the main setting. The idea is that we can 
follow the optimal form of the pricing strategy while choosing any normalization constant 
$K \ge \frac{T}{B}\gamma_{T,A}$. It may no longer be optimal, but it will ensure that we satisfy the 
budget and give guarantees depending on the magnitude of $K$. So all we need is an approximate 
estimate of some value larger than $\gamma_{T,A}$. Both $\bar{c}$ and $\sqrt{\mu}$ are guaranteed to 
upper-bound $\gamma_{T,A}$, so both can be used to pick $K$ while satisfying the budget. 

To recap, knowledge of only a simple statistic such as the mean of the arriving costs 
suffices for good learning guarantees, with better knowledge translating to better guarantees. 
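The ordering of these three statistics follows from $\Delta_{h_t,f_t} \le 1$ and Jensen's inequality, and can be checked on any sequence. The sketch below does so on synthetic values; `cost_statistics` is an illustrative name and the data are randomly generated, not taken from the paper's experiments.

```python
import math
import random

def cost_statistics(deltas, costs):
    """Return (gamma, c_bar, sqrt_mu) for one fixed sequence.

    gamma  : average of delta_t * sqrt(c_t)
    c_bar  : average of sqrt(c_t)
    sqrt_mu: square root of the average cost
    Assumes every delta_t and c_t lies in [0, 1].
    """
    T = len(costs)
    gamma = sum(d * math.sqrt(c) for d, c in zip(deltas, costs)) / T
    c_bar = sum(math.sqrt(c) for c in costs) / T
    sqrt_mu = math.sqrt(sum(costs) / T)
    return gamma, c_bar, sqrt_mu

rng = random.Random(0)
deltas = [rng.random() for _ in range(1000)]
costs = [rng.random() for _ in range(1000)]
gamma_value, c_bar, sqrt_mu = cost_statistics(deltas, costs)
```

Each statistic to the right is easier to know in advance but gives a weaker guarantee, matching the graceful degradation described above.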

(4) $\gamma_{T,A}$ depends on the algorithm; what are the implications? 

We first note that $\gamma_{T,A}$ can be upper-bounded by, for instance, $\sqrt{\mu}$, where $\mu$ is the 
average of the arriving costs. So a bound containing $\gamma_{T,A}$ does imply nontrivial 
algorithm-independent bounds. The purpose of $\gamma_{T,A}$ is to capture cases where we can do 
significantly better than such bounds because the algorithm is a good fit for the problem. To see 
this, note that running the FTRL algorithm on the entire data sequence (with no budget 
constraint) gives a regret bound of $\frac{\beta}{\eta} + \eta\sum_{t=1}^{T}\Delta_{h_t,f_t}^2$. The worst 
case has each $\Delta_{h_t,f_t}$ equal to 1, producing a $\sqrt{T}$ regret bound. But in a case where 
the algorithm has a small average $\Delta_{h_t,f_t}$ and the algorithm enjoys a better regret bound, 
we may also hope that this improvement is reflected in $\gamma_{T,A}$. 

However, one might hope for an algorithm-independent quantity that, in analogy with 
VC-dimension, captures the “difficulty” of the purchasing and learning problem instance. 
This leads to the question: 

(4a) Can we remove the algorithm-dependence of the bound? One might hope to achieve 
a bound depending on an algorithm-independent quantity that captures correlations between 
data and cost. A natural candidate is $\gamma^*_T := \frac{1}{T}\sum_t \Delta_{h^*,f_t}\sqrt{c_t}$. In general, 
there are difficult cases where one cannot achieve a bound in terms of $\gamma^*_T$. However, in 
nicer scenarios we may expect $\gamma_{T,A}$ to approximate $\gamma^*_T$. For instance, suppose 
$\ell(h,z) = \phi(h \cdot z)$ where $\phi$ is a differentiable convex function whose gradient is 
1-Lipschitz (commonly used examples include the squared hinge loss and the log loss). Under 
this condition, where again we are using $f_t(\cdot) := \ell(\cdot, z_t)$, we can show that 

$$\Delta_{h_t,f_t}\sqrt{c_t} - \Delta_{h^*,f_t}\sqrt{c_t} = \|\nabla\ell(h_t,z_t)\|_\star\sqrt{c_t} - \|\nabla\ell(h^*,z_t)\|_\star\sqrt{c_t}$$
$$\le \left\|\left(\phi'(h_t \cdot z_t) - \phi'(h^* \cdot z_t)\right) z_t\right\|_\star$$
$$\le \left|\phi(h_t \cdot z_t) - \phi(h^* \cdot z_t)\right| = \left|\ell(h_t,z_t) - \ell(h^*,z_t)\right|.$$

By the regret guarantee of our mechanism when run with a good algorithm, even initialized 
with very weak knowledge, this difference in losses per time step is $o(1)$, implying that 
$\gamma_{T,A} \to \gamma^*_T$. A deeper investigation of this phenomenon is a good candidate for future work. 

4.5 Mechanisms and Results for Regret Minimization 

In the previous section, we presented our results for the easier “at-cost” variant. We now 
apply the approach derived for that setting to the main regret minimization problem. 

For this problem, unlike in the “at-cost” variant, we cannot in general solve for the form 
of the optimal pricing strategy. This is intuitively because, when we must pay the price we 
post, the optimal strategy depends on $c_t$. But the algorithm cannot condition the purchasing 
decision directly on $c_t$, as this is private information of the arriving agent. 

We propose simply drawing posted prices according to the optimal strategy derived for 
the at-cost setting, namely, 

$$\Pr[\pi_t(f) \ge c] = \min\left\{1, \frac{\Delta_{h_t,f}}{K\sqrt{c}}\right\} \qquad (4)$$


but with a different choice of normalization constant $K$. We note that there is a pricing 
distribution that accomplishes this: 

Observation 1. For any $K$ and $\Delta_{h_t,f}$, there exists a pricing distribution on $\pi_t(f)$ that 
satisfies Equation 4. Letting $c^* = \Delta_{h_t,f}^2/K^2$, the CDF is given by 
$F(\pi) = \Pr[\pi_t(f) \le \pi] = 0$ if $\pi \le c^*$, $F(\pi) = 1 - \Delta_{h_t,f}/(K\sqrt{\pi})$ if 
$c^* < \pi < 1$, and $F(\pi) = 1$ if $\pi \ge 1$. 

(a) Probability density function of the pricing distribution. The price $\pi(f) = 1$ with 
probability $\min\{1, \Delta_{h_t,f}/K\}$. On the interval $(c^*, 1)$ the density function is 
$x \mapsto \Delta_{h_t,f}/(2Kx^{3/2})$. 

(b) Cumulative distribution function of the pricing distribution. Equal to zero for 
$\pi \le c^*$, then equal to $1 - \Delta_{h_t,f}/(K\sqrt{\pi})$ on $(c^*, 1)$, then equal to 1 at price 1. 

Figure 2: The pricing distribution. Illustrates the distribution from which we draw our posted 
prices at time $t$, for a fixed arrival $f$. The quantity $\Delta_{h_t,f}$ captures the “benefit” from 
obtaining $f$. $K$ is a normalization parameter. The distribution’s support has a lowest price $c^*$, 
which has the form $c^* = \Delta_{h_t,f}^2/K^2$. 
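One way to draw a price from this distribution is inverse-transform sampling on the tail probability. The sketch below is a hedged illustration, not the authors' implementation: the function names are hypothetical, and the sampler relies on the observation that if $u$ is uniform on $(0,1]$ then $\min\{1, (\Delta/(Ku))^2\}$ is at least $c$ exactly when $u \le \Delta/(K\sqrt{c})$.

```python
import math
import random

def purchase_probability(delta, cost, K):
    """Target tail probability Pr[price >= cost] = min{1, delta / (K * sqrt(cost))}."""
    if cost <= 0.0:
        return 1.0
    return min(1.0, delta / (K * math.sqrt(cost)))

def draw_price(delta, K, rng):
    """Draw one posted price in [c*, 1] with the tail probabilities above.

    delta: benefit of the arrival (gradient norm), assumed to lie in [0, 1]
    K    : positive normalization constant
    rng  : a random.Random instance
    """
    if delta <= 0.0:
        return 0.0                      # worthless data is never paid for
    u = 1.0 - rng.random()              # uniform on (0, 1]
    ratio = delta / (K * u)
    return 1.0 if ratio >= 1.0 else ratio * ratio
```

The point mass at price 1 arises from the draws with $u \le \Delta/K$, and the smallest possible price, attained at $u = 1$, is $c^* = \Delta^2/K^2$.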


The pricing distribution is given in Figure 2. This strategy gives Mechanism 3. 

As in the known-costs case, our regret bounds depend upon the prior knowledge of the 
algorithm. It will turn out to be helpful to have prior knowledge about both $\gamma_{T,A}$ and the 
following parameter, which can be interpreted as $\gamma_{T,A}$ with all costs $c_t = 1$: 

$$\gamma^{\max}_{T,A} = \mathbb{E}\left[\frac{1}{T}\sum_t \Delta_{h_t,f_t}\right].$$

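Theorem 4.2. If Mechanism 3 is run with prior knowledge of $\gamma_{T,A}$ and of $\gamma^{\max}_{T,A}$ (up to a 
constant factor), then it can choose $K$ and $\eta$ to satisfy the expected budget constraint and 
obtain a regret bound of 

$$O\left(\max\left\{\frac{T}{\sqrt{B}}\, g,\ \sqrt{T}\right\}\right),$$

where $g = \sqrt{\gamma_{T,A} \cdot \gamma^{\max}_{T,A}}$ (by setting $K = \frac{T}{B}\gamma^{\max}_{T,A}$). Similarly, 
knowledge only of $\gamma_{T,A}$, respectively $\bar{c} = \frac{1}{T}\sum_t \sqrt{c_t}$, respectively 
$\mu = \frac{1}{T}\sum_t c_t$, suffices for the regret bound with $g = \sqrt{\gamma_{T,A}}$, 
respectively $g = \sqrt{\bar{c}}$, respectively $g = \mu^{1/4}$. 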

We can observe a quantifiable “price of strategic behavior” in the difference between the 
regret guarantees of Theorem 4.2 (this setting) and Theorem 4.1 (the “at-cost” setting): 

$$\frac{T}{\sqrt{B}}\sqrt{\gamma_{T,A} \cdot \gamma^{\max}_{T,A}} \quad \text{vs.} \quad \frac{T}{\sqrt{B}}\,\gamma_{T,A}.$$

Note that $\gamma^{\max}_{T,A} \ge \gamma_{T,A}$, and they approach equality as all costs approach the upper 
bound 1, but become very different as the average cost $\mu \to 0$ while the maximum cost remains 
fixed at 1. 

Mechanism 3: Mechanism for no-regret data-purchasing problem. 

Input: parameters $K$, $\eta$, access to online learning algorithm (OLA) 
set OLA parameter $\eta$; 
for $t = 1, \ldots, T$ do 
    post hypothesis $h_t \leftarrow$ OLA; 
    post prices $\pi_t(f)$ drawn randomly such that $\Pr[\pi_t(f) \ge c] = \min\{1, \Delta_{h_t,f}/(K\sqrt{c})\}$; 
    if we receive $(c_t, f_t)$ then 
        let $q_t = \Pr[\pi_t(f_t) \ge c_t]$; 
        let importance-weighted loss function $\hat{f}_t(\cdot) = f_t(\cdot)/q_t$; 
        send $\hat{f}_t \to$ OLA; 
    else 
        send 0 function $\to$ OLA; 
    end 
end 
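The loop of Mechanism 3 can be sketched in code for one concrete learner. The example below is an assumption-laden illustration rather than the paper's implementation: it uses online gradient descent on the hinge loss as the OLA, takes $\Delta_{h_t,f_t}$ to be the Euclidean norm of the hinge subgradient (so feature vectors are assumed to have norm at most 1), and simulates the agent's accept/reject decision directly. The function names are hypothetical.

```python
import math
import random

def hinge_gradient(w, x, y):
    """Subgradient of max(0, 1 - y * <w, x>) with respect to w."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    if margin >= 1.0:
        return [0.0] * len(w)
    return [-y * xi for xi in x]

def run_posted_price_ogd(stream, K, eta, seed=0):
    """stream: list of (cost, x, y) with cost in [0, 1]. Returns (hypotheses, spent)."""
    rng = random.Random(seed)
    w = [0.0] * len(stream[0][1])
    hypotheses, spent = [], 0.0
    for cost, x, y in stream:
        hypotheses.append(list(w))                     # post hypothesis h_t
        grad = hinge_gradient(w, x, y)
        delta = math.sqrt(sum(g * g for g in grad))    # benefit of this arrival
        if delta == 0.0:
            continue                                   # zero gradient: no update, nothing spent
        u = 1.0 - rng.random()                         # uniform on (0, 1]
        ratio = delta / (K * u)
        price = 1.0 if ratio >= 1.0 else ratio * ratio # posted price for this arrival
        if price >= cost:                              # agent accepts the posted price
            q = 1.0 if cost == 0.0 else min(1.0, delta / (K * math.sqrt(cost)))
            spent += price                             # main setting: pay the posted price
            w = [wi - eta * g / q for wi, g in zip(w, grad)]  # importance-weighted step
    return hypotheses, spent
```

In the “at-cost” variant the only change would be to add `cost` rather than `price` to `spent`. Rejected rounds leave `w` untouched, which corresponds to sending the zero function to the learner.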


Comparison to lower bound. Our lower bound for the data purchasing regret minimization 
problem is $\Omega\left(\frac{T}{\sqrt{B}}\gamma_{T,A}\right)$ (follows from the lower bound for the at-cost 
setting, Theorem 6.2). So the difference in bounds discussed above, a factor of 
$\sqrt{\gamma^{\max}_{T,A}}$ versus $\sqrt{\gamma_{T,A}}$, is the only 
gap between our upper and lower bounds for the general data purchasing no-regret problem. 

The most immediate open problem in this paper is to close this gap. Intuitively, the lower 
bound does not take advantage of “strategic behavior” in that a posted-price mechanism may 
often have to pay significantly more than the data actually costs, meaning that it obtains less 
data in the long run. Meanwhile, it may be possible to improve on our upper-bound strategy 
by drawing prices from a different distribution. 


5 Results for Statistical Learning 

In this section, we give the final mechanism, Mechanism 4, for the data purchasing statistical 
learning problem. The idea is to simply run the regret-minimization Mechanism 3 on the 
arriving agents. At each stage, Mechanism 3 posts a hypothesis $h_t$. We then aggregate these 
hypotheses by averaging to obtain our final prediction. 

Mechanism 4: Mechanism for statistical learning data-purchasing problem. 

Input: parameters $K$, $\eta$, access to OLA 
identify each data point $z$ with the loss function $f(\cdot) = \ell(\cdot, z)$; 
run Mechanism 3 with parameters $\eta$, $K$ and access to OLA; 
let $h_1, \ldots, h_T$ be the resulting hypotheses; 
output $\bar{h} = \frac{1}{T}\sum_t h_t$; 

Theorem 5.1. Mechanism 4 guarantees spending at most $B$ in expectation and 

$$\mathbb{E}\,\mathcal{L}(\bar{h}) \le \mathcal{L}(h^*) + O\left(\max\left\{\frac{g}{\sqrt{B}},\ \frac{1}{\sqrt{T}}\right\}\right),$$

where $g = \sqrt{\gamma_{T,A} \cdot \gamma^{\max}_{T,A}}$ (assuming that $\gamma_{T,A}$ and $\gamma^{\max}_{T,A}$ are known 
in advance up to a constant factor). 

If one assumes approximate knowledge respectively of $\gamma_{T,A}$, of 
$\bar{c} = \frac{1}{T}\sum_t \sqrt{c_t}$, or of $\mu = \frac{1}{T}\sum_t c_t$, then the guarantee holds with 
respectively $g = \sqrt{\gamma_{T,A}}$, $g = \sqrt{\bar{c}}$, or $g = \mu^{1/4}$. 

Proof. By Theorem 4.2, Mechanism 3 guarantees an expected regret of 
$O\left(\max\left\{\frac{T}{\sqrt{B}}\, g,\ \sqrt{T}\right\}\right)$ when run with the specified prior knowledge 
for the specified values of $g$. Therefore, the online-to-batch conversion of Lemma 3.2 proves 
the theorem. □ 

The statement of the Main Result is the special case where only $\gamma_{T,A}$ is known and 
$g = \sqrt{\gamma_{T,A}}$. A detailed discussion of $\gamma_{T,A}$ is in Section 4.4. 

6 Deriving Pricing and the “at-cost” Variant 


In Section 4.3, we stated our results for the easier at-cost variant of the regret minimization 
with purchased data problem. This included the posted-price distribution that we use for our 
main results. In this section, we show how these results and this distribution are derived. 
The “at-cost” variant is formally defined in exactly the same way as the main setting, except 
that when $\pi_t \ge c_t$ and the transaction occurs, the mechanism only pays the cost $c_t$ rather 
than the posted price $\pi_t$. 

We first show how our posted-price strategy is derived as the optimal solution to the 
problem of minimizing regret subject to the budget constraint. The resulting upper bounds 
for the “at-cost” variant were given in Theorem 4.1. Then, we give some fundamental lower 
bounds on regret, showing that in general our upper bounds cannot be improved upon here. 
These lower bounds also hold for the main no-regret data purchasing problem, where there is 
a small gap to the upper bounds. 


6.1 Deriving an Optimal Pricing Strategy 

We begin by asking what seems to be an even easier question. Suppose that for every pair 
$(c_t, f_t)$ that arrives, we could first “see” $(c_t, f_t)$, then choose a probability with which to 
obtain $(c_t, f_t)$ and pay $c_t$. What would be the optimal probability with which to take this data? 

Lemma 6.1. To minimize the regret bound of Lemma 3.1, the optimal choice of sampling 
probability is of the form $q_t = \min\left\{1, \Delta_{h_t,f_t}/(K^*\sqrt{c_t})\right\}$. The normalization 
factor $K^* \approx \frac{T}{B}\gamma_{T,A}$. 

The proof follows by formulating the convex programming problem of minimizing the 
regret bound of Lemma 3.1 subject to an expected budget constraint. It also gives the form of 
the normalization constant $K^*$, which depends on the input data sequence and the hypothesis 
sequence. 

The key insight is now that we can actually achieve the sampling probabilities dictated by 
Lemma 6.1 using a randomized posted-price mechanism. Notice that these optimal sampling 
probabilities are decreasing in $c_t$. In general, when drawing a price from some distribution, 
the probability that it exceeds $c$ will be decreasing in $c$. So it only remains to find the 
posted-price distribution that actually induces the sampling probabilities that we want for all 
$c$ simultaneously. That is, by randomly drawing posted prices according to our distribution, 
we choose to purchase $(c_t, f_t)$ with exactly the probability $q_t$ stated in Lemma 6.1, for any 
possible value of $c_t$ and without knowing $(c_t, f_t)$. 

Thus, our final mechanism for the at-cost variant is to simply apply Mechanism 3, but 
only pay the cost of the arrival rather than the price we posted. We set 
$K = \frac{T}{B}\gamma_{T,A}$. Note that this choice of normalization constant $K$ is different from the 
main setting because we on average pay less in the at-cost setting; this leads to the difference 
in the regret bounds. Our main bound for the at-cost variant was given in Theorem 4.1. An open 
problem for this setting is whether one can obtain the same regret bounds without any prior 
knowledge at all about the arriving costs and data. 
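For a fixed sequence of benefits and costs, the normalization constant of Lemma 6.1 can be found numerically, because expected at-cost spending is non-increasing in $K$. The sketch below uses bisection; it is an illustration with hypothetical function names and made-up inputs, and it treats the benefits as given numbers rather than as outputs of a learning algorithm.

```python
import math

def sampling_probs(deltas, costs, K):
    """q_t = min{1, delta_t / (K * sqrt(c_t))}; points with zero benefit are skipped."""
    probs = []
    for d, c in zip(deltas, costs):
        if d <= 0.0:
            probs.append(0.0)
        elif K <= 0.0 or c <= 0.0:
            probs.append(1.0)
        else:
            probs.append(min(1.0, d / (K * math.sqrt(c))))
    return probs

def expected_spend(deltas, costs, K):
    """Expected at-cost payment: sum of q_t * c_t."""
    return sum(q * c for q, c in zip(sampling_probs(deltas, costs, K), costs))

def solve_normalizer(deltas, costs, budget, iters=100):
    """Smallest K (up to bisection accuracy) whose expected spend fits a positive budget."""
    if expected_spend(deltas, costs, 0.0) <= budget:
        return 0.0                                  # enough budget to buy everything
    lo = 0.0
    # With no clipping, spend equals sum(delta*sqrt(c))/K, so this K always fits the budget.
    hi = sum(d * math.sqrt(c) for d, c in zip(deltas, costs)) / budget
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if expected_spend(deltas, costs, mid) > budget:
            lo = mid
        else:
            hi = mid
    return hi
```

When no probability is clipped at 1, the returned value coincides with $\frac{1}{B}\sum_t \Delta_t\sqrt{c_t}$, which is $\frac{T}{B}$ times the average of $\Delta_t\sqrt{c_t}$, matching the approximate form stated in the lemma.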


6.2 Lower Bounds for Regret Minimization 

Here, we prove lower bounds analogous to the classic regret lower bound, which states that 
no algorithm can guarantee to do better than $O(\sqrt{T})$. These lower bounds will hold even in 
the “at-cost” setting, where they match our upper bounds. An open problem is to obtain a 
larger-order lower bound for the main setting where the mechanism pays its posted price. 
This would show a separation between the at-cost variant and the main problem. 

First, we give what might be considered a “sample complexity” lower bound for no-regret 
learning: It specializes our setting to the case where all costs are equal to one (and this 
is known to the algorithm in advance), so the question is what regret is achievable by an 
algorithm that observes $B$ of the $T$ arrivals. 

Theorem 6.1. Suppose all costs $c_t = 1$. No algorithm for the at-cost online data-purchasing 
problem has regret better than $O(T/\sqrt{B})$; that is, for every algorithm, there exists an input 
sequence on which its regret is $\Omega(T/\sqrt{B})$. 

Proof Idea: We will have two coins, with probabilities $\frac{1}{2} \pm \epsilon$ of coming up heads. We will 
take one of the coins and provide $T$ i.i.d. flips as the input sequence. The possible hypotheses 
for the algorithm are {heads, tails} and the loss is zero if the hypothesis matches the flip and 
one otherwise. The cost of every data point will be one. 

The idea is that an algorithm with regret much smaller than $T\epsilon$ must usually predict 
heads if it is the heads-biased coin and usually predict tails if it is the tails-biased coin. 
Thus, it can be used to distinguish these cases. However, there is a lower bound of 
$\Omega(1/\epsilon^2)$ samples required to distinguish the coins, and the algorithm only has enough 
budget to gain information about $O(B)$ of the samples. Setting $\epsilon = 1/\sqrt{B}$ gives the regret 
bound. □ 

We next extend this idea to the case with heterogeneous costs. The idea is very simple: 
Begin with the problem from the label-complexity lower bound, and introduce “useless” data 
points and heterogeneous costs. The worst or “hardest” case for a given average cost is 
when cost is perfectly correlated with benefit, so all and only the “useful” data points are 
expensive. 


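Theorem 6.2. No algorithm for the non-strategic online data-purchasing problem has expected 
regret better than $O\left(\gamma_{T,A} T/\sqrt{B}\right)$; that is, for every $\gamma_{T,A}$, for every algorithm, 
there exists a sequence with parameter $\gamma_{T,A}$ on which its regret is 
$\Omega\left(\gamma_{T,A} T/\sqrt{B}\right)$. Similarly, for $\bar{c} = \frac{1}{T}\sum_t \sqrt{c_t}$ and 
$\mu = \frac{1}{T}\sum_t c_t$, we have the lower bounds $\Omega\left(T\bar{c}/\sqrt{B}\right)$ and 
$\Omega\left(T\sqrt{\mu}/\sqrt{B}\right)$. 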


7 Examples and Experiments 

In this section, we give some examples of the performance of our mechanisms on data. We 
use a binary classification problem with feature vector $x$ and label $y \in \{-1, 1\}$. The 
dataset is described in Figure 3. 



(a) Visualizing the classification problem 
without costs. 



(b) A brighter green background corresponds to a higher-cost data point. 


Figure 3: Dataset. Data points are images of handwritten digits, each data point consisting of a 
feature vector x of grayscale pixels and a label y, the digit it depicts. We use the MNIST handwritten 
digit dataset (http://yann.lecun.com/exdb/mnist/). The algorithm is asked to distinguish between 
two “categories” of digits, where “positive” examples are digits 9 and 8 and “negative” examples 
are 1 and 4 (all other digits are not used). The number of training examples is T = 8503. This task 
allows us to adjust the correlations by drawing costs differently for different digits. 


The hypothesis is a hyperplane classifier, i.e. a vector $w$ where the example is classified as 
positive if $w \cdot x \ge 0$ and negative otherwise; the risk is therefore the error rate (fraction of 
examples misclassified). For the implementation of the online gradient descent algorithm, we 
use a “convexified” loss function, the well-known hinge loss: 
$\ell(w, (x, y)) = \max\{0, 1 - y(w \cdot x)\}$ where $y \in \{-1, 1\}$. 

In our simulations, we give each mechanism access to the exact same implementation of 
the Online Gradient Descent algorithm, including the same parameter $\eta$ chosen to be $0.1/c$, 
where $c$ is the average norm of the data feature vectors. We train on a randomly chosen half 
of the dataset and test on the other half. 

(a) A comparison of mechanisms. “Naive” offers a maximum price of 1 to every arrival until 
out of budget. “Ours” is our mechanism with $K$ initialized to 0 and then adjusted online 
according to the estimated average $\gamma_{T,A}$ on the data so far. “Baseline” obtains every data 
point (has no budget constraint). Costs are distributed uniform (0,1) independently. Each 
datapoint is an average of 4000 trials, with standard error of at most 0.0002. 

(b) An illustration of the role of cost-data correlations. The marginal distribution of costs 
is 1 with probability 0.2 and free otherwise, but the correlation of cost and data changes. 
The performance of Naive and the Baseline do not change with correlations. The 
larger-$\gamma_{T,A}$ case has high-cost points consisting of only 4s and 9s, while $\gamma_{T,A}$ is smaller 
when costs and data are independent. Each datapoint is an average of 2000 trials, with 
standard error of at most 0.0004. 

Figure 4: Examples of mechanism performance. 

The “baseline” mechanism has no budget cap and purchases every data point. The “naive” 
mechanism offers a maximum price of 1 for every data point until out of budget. “Ours” is 
an implementation of our mechanism. We do not use any prior knowledge of the costs at all: 
We initialize $K = 0$ and then adjust $K$ online by estimating $\gamma_{T,A}$ from the data purchased so 
far. (For a symmetric comparison, we do not adjust $\eta$ accordingly; instead we leave it at the 
same value as used with the other mechanisms.) The examples are shown in Figure 4. 

8 Discussion and Conclusion 

8.1 Agent-Mechanism Interaction Model 

Our model of interaction, while perhaps the simplest initial starting point, involves some 
subtleties that may be interesting to address in the future. A key property is that we need 
to obtain both an arriving agent’s data point $z$ and her cost $c$. The reason is that the cost 
is used to importance-weight the data based on the probability of picking a price larger 
than that cost. (The cost report is also required by [16] for the same reason.) As discussed 
earlier, a naive implementation of this model is incentive-compatible but not strictly 
so. Exploring implementations, such as the trusted third party approach mentioned, is an 
interesting direction. For instance, in a strictly truthful implementation, the arriving agent 
can cryptographically commit to a bid, e.g. by submitting a cryptographic hash of her cost. 
Then the prices are posted by the mechanism. If the agent accepts, she reveals her data and 
her cost, verifying that the cost hashes to her commitment. It is strictly truthful for the 
agent to commit to her true cost. 

This paper focused on the learning-theoretic aspects of the problem, but exploring the 
model further or proposing alternatives is also of interest for future work. 

8.2 Conclusions and Directions 

The contribution of this work was to propose an active scheme for learning and pricing data 
as it arrives online, held by strategic agents. The active approach allows learning from past 
data and selectively pricing future data. Our mechanisms interface with existing no-regret 
algorithms in an essentially black-box fashion (although the proof depends on the specific 
class of algorithms). The analysis relies on showing that they have good guarantees in a 
model of no-regret learning with purchased data. This no-regret setting may be of interest in 
future work, to either achieve good guarantees with no foreknowledge at all other than the 
maximum cost, or to propose variants on the model. 

The no-regret analysis means our mechanisms are robust to adversarial input. But in 
nicer settings, one might hope to improve on the guarantees. One direction is to assume that 
costs are drawn according to a known marginal distribution (although the correlation with 
the data is unknown). A combination of our approach and the posted-price distributions of 
Roth and Schoenebeck [16] may be fruitful here. 

Broadly, the problem of purchasing data for learning has many potential models and 
directions for study. One motivating setting, closer to crowdsourcing, is an active problem 
where data points consist of pairs (example, label) and the mechanism can offer a price 
for anyone who obtains the label of a given example. In an online arrival scheme, such a 
mechanism could build on the importance-weighted active learning paradigm [1]. 

Appendix 


A Tools for Converting Regret-Minimizing Algorithms 

Lemma A.1 (Lemma 3.1). Assume we implement the importance-weighting algorithm with nonzero 
sampling probabilities $q_1, \ldots, q_T$. Assume the underlying OLA is FTRL with regularizer 
$G : \mathcal{H} \to \mathbb{R}$ that is strongly convex with respect to $\|\cdot\|$. Then the expected regret, 
with respect to the loss sequence $f_1, \ldots, f_T$, is no more than 

$$R = \frac{\beta}{\eta} + 2\eta\, \mathbb{E}\left[\sum_{t=1}^{T} \frac{\Delta_{h_t,f_t}^2}{q_t}\right],$$

where $\beta$ is a constant depending on $\mathcal{H}$ and $G$, $\eta$ is a parameter of the algorithm, and the 
expectation is over any randomness in the choices of $h_t$ and $q_t$. 


Proof. Let $h^* = \arg\min_h \sum_t f_t(h)$. We wish to prove that 

$$\mathbb{E}_{\{h_t,q_t\}}\left[\sum_t f_t(h_t)\right] \le \sum_t f_t(h^*) + R,$$

where $\{h_t, q_t\}$ is shorthand for $\{h_1, q_1, \ldots, h_T, q_T\}$ and 

$$R = \frac{\beta}{\eta} + 2\eta\, \mathbb{E}_{\{h_t,q_t\}}\left[\sum_t \frac{\Delta_{h_t,f_t}^2}{q_t}\right].$$

As a prelude, note that in general these expectations could be quite tricky to deal with. We 
consider a fixed input sequence $f_1, \ldots, f_T$, but each random variable $q_t$, $h_t$ depends on the 
prior sequence of variables and outcomes. However, we will see that the nice feature of the 
importance-weighting technique helps make this problem tractable. 

Some preliminaries: Define the importance-weighted loss function at time $t$ to be the 
random variable 

$$\hat{f}_t(h) = \begin{cases} f_t(h)/q_t & \text{if we obtain } f_t \\ 0 & \text{otherwise.} \end{cases}$$

Let $I_t$ be the indicator random variable equal to 1 if we obtain $f_t$, which occurs with 
probability $q_t$, and equal to 0 otherwise. Then notice that for any hypothesis $h$, 

$$\hat{f}_t(h) = I_t\,\frac{f_t(h)}{q_t} \quad\Longrightarrow\quad \mathbb{E}_{I_t}\left[\hat{f}_t(h) \mid q_t\right] = f_t(h). \qquad (5)$$

To be clear, the expectation is over the random outcome of whether or not we obtain datapoint 
$f_t$, conditioned on the value of $q_t$; and conditioned on the value of $q_t$, by definition we obtain 
datapoint $f_t$ with probability $q_t$ and obtain the 0 function otherwise. 

Now we proceed with the proof. For any method of choosing $q_1, \ldots, q_T$ and any resulting 
outcomes of $I_t$, the algorithm reduces to running the Follow-the-Regularized-Leader algorithm 
on the sequence of convex loss functions $\hat{f}_1, \ldots, \hat{f}_T$. Thus, by the regret bound proof for 
FTRL (Lemma A.2), FTRL guarantees that for every fixed “reference hypothesis” $h \in \mathcal{H}$, 

$$\sum_t \hat{f}_t(h_t) \le \sum_t \hat{f}_t(h) + \hat{R},$$

where 

$$\hat{R} = \frac{\beta}{\eta} + 2\eta \sum_t \Delta_{h_t,\hat{f}_t}^2 = \frac{\beta}{\eta} + 2\eta \sum_t I_t\, \frac{\Delta_{h_t,f_t}^2}{q_t^2}.$$

(Recall that $\Delta_{h,f} = \|\nabla f(h)\|_\star$.) Now we will take the expectation of both sides, 
separating out the expectation over the choice of $q_t$, over $h_t$, and over $I_t$: 

$$\sum_t \mathbb{E}_{h_t,q_t}\left[\mathbb{E}_{I_t}\left[\hat{f}_t(h_t) \mid h_t, q_t\right]\right] \le \sum_t \mathbb{E}_{h_t,q_t}\left[\mathbb{E}_{I_t}\left[\hat{f}_t(h) \mid h_t, q_t\right]\right] + \mathbb{E}_{\{h_t,q_t\}}\left[\mathbb{E}_{\{I_t\}}\left[\hat{R} \mid \{h_t, q_t\}\right]\right].$$

Use the importance-weighting observation above (5): 

$$\mathbb{E}_{\{h_t,q_t\}}\left[\sum_t f_t(h_t)\right] \le \sum_t f_t(h) + R,$$

where 

$$R = \frac{\beta}{\eta} + 2\eta\, \mathbb{E}_{\{h_t,q_t\}}\left[\sum_t \frac{\Delta_{h_t,f_t}^2}{q_t}\right].$$

In particular, because this holds for every reference hypothesis $h$, it holds for $h^*$. 

□ 
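The two facts about importance weighting that drive this proof can be verified on a single number: the weighted value is unbiased, and its second moment is inflated by a factor $1/q_t$, which is where the $\Delta_{h_t,f_t}^2/q_t$ term comes from. The sketch below is illustrative; the function names and the sample values are assumptions.

```python
import random

def importance_weighted(value, q, obtained):
    """value/q if the data point was obtained, and 0 otherwise."""
    return value / q if obtained else 0.0

def exact_moments(value, q):
    """Exact mean and second moment of the importance-weighted value."""
    weighted = importance_weighted(value, q, True)
    mean = q * weighted                 # the 'not obtained' branch contributes 0
    second = q * weighted ** 2
    return mean, second

def monte_carlo_mean(value, q, n, seed=0):
    """Empirical mean over n independent obtain/skip draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        total += importance_weighted(value, q, rng.random() < q)
    return total / n
```

The mean returned by `exact_moments` equals the original value for any nonzero `q`, while the second moment equals `value**2 / q`, so rarely purchased points contribute more variance to the learner.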


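Lemma A.2. Let $G$ be 1-strongly convex with respect to some norm $\|\cdot\|$. The regret of the 
Follow-The-Regularized-Leader algorithm with regularizer $G$ and convex loss functions 
$f_1, \ldots, f_T$ can be bounded by 

$$\frac{\beta}{\eta} + 2\eta \sum_t \|\nabla f_t(h_t)\|_\star^2,$$

where $\beta$ is the upper bound of $G(\cdot)$. 

Proof. We reproduce the standard proof, writing $R$ for the regularizer $G$ and $\ell(h, f_t)$ for 
$f_t(h)$. First, the regret of Follow-The-Regularized-Leader can be bounded by 

$$\frac{1}{\eta}\left(R(h^*) - R(h_1)\right) + \sum_t \left(\ell(h_t, f_t) - \ell(h_{t+1}, f_t)\right).$$

Below we show that $\ell(h_t, f_t) - \ell(h_{t+1}, f_t) \le 2\eta\|\nabla\ell(h_t, f_t)\|_\star^2$. 

Define $\Phi_t(h) = R(h)/\eta + \sum_{s=1}^{t} \ell(h, f_s)$. By definition, we know 
$h_t = \arg\min_h \Phi_{t-1}(h)$. 

Since $\ell(\cdot)$ is convex and $R(\cdot)$ is 1-strongly convex, we know $\Phi_t(\cdot)$ is $(1/\eta)$-strongly 
convex for all $t$. Therefore, since $h_{t+1}$ minimizes $\Phi_t$, by definition of strong convexity, 
we get 

$$\Phi_t(h_t) \ge \Phi_t(h_{t+1}) + \frac{1}{2\eta}\|h_t - h_{t+1}\|^2.$$

After simple manipulations, we get 

$$\|h_t - h_{t+1}\|^2 \le 2\eta\left(\Phi_t(h_t) - \Phi_t(h_{t+1})\right)$$
$$= 2\eta\left(\Phi_{t-1}(h_t) - \Phi_{t-1}(h_{t+1})\right) + 2\eta\left(\ell(h_t, f_t) - \ell(h_{t+1}, f_t)\right)$$
$$\le 2\eta\left(\ell(h_t, f_t) - \ell(h_{t+1}, f_t)\right).$$

The last inequality comes from the fact that $h_t$ is the minimizer of $\Phi_{t-1}$. 

Since $\ell(\cdot)$ is convex, we have 

$$\ell(h_t, f_t) - \ell(h_{t+1}, f_t) \le (h_t - h_{t+1}) \cdot \nabla\ell(h_t, f_t)$$
$$\le \|h_t - h_{t+1}\|\, \|\nabla\ell(h_t, f_t)\|_\star.$$

The last inequality comes from the generalized Cauchy-Schwarz inequality. 

Combining the above two inequalities together, we get 

$$\ell(h_t, f_t) - \ell(h_{t+1}, f_t) \le \|\nabla\ell(h_t, f_t)\|_\star \sqrt{2\eta\left(\ell(h_t, f_t) - \ell(h_{t+1}, f_t)\right)}.$$

By squaring and shifting sides, 

$$\ell(h_t, f_t) - \ell(h_{t+1}, f_t) \le 2\eta\|\nabla\ell(h_t, f_t)\|_\star^2.$$

The proof is completed by inserting the inequality into the regret bound. □ 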


B No regret “at-cost” setting 


B.l At-cost upper bounds 


Lemma B.l (Lemma 6.1). To minimize the regret bound of Lemma 3.1, the optimal choice 
of sampling probability is of the form qt = min {1 , \f^t} ■ The normalization factor 


K* 




Proof. Recall that the regret bound of Lemma 3.1 


IS 


V t 


where qt is the probability with which we choose to purchase arrival (c*, ft). We will solve for 
the choices of qt for each t. 
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Since /3 is a, constant and rj a parameter to be tuned later, our problem is to minimize the 
summation term in this regret bound. This yields the following optimization problem; 

minV^ 

s.t. '^qfCt<B 

t 

g* < 1 (Vt). 


The first constraint is the expected budget constraint, as we take each point (q, ft) with 
probability qt and pay q if we do. The second constrains each qt to be a probability. 

To be completely formal, our goal is to minimize the expectation of the summation in 
the objective, as each ht and qt are random variables (they depend on the previous steps). 
However, our approach will be to optimize this objective pointwise: For every prior sequence 
hi,... ,ht and qi,..., qt-i, we pick the optimal qt. Therefore in the proof we will elide the 
expectation operators and argument. Similarly, since the budget constraint holds for all 
choices of qt that we make, we elide the expectation over the randomness in qt. 

The Lagrangian of this problem is 


{qt: cxt}) — "^2 —^ + ~ oq jqt — i) 


qt 


with each A, qt, at > 0. At optimum. 


oqt 


-^ A A 

qt 


implying that 

V Act + at 

By complementary slackness, at{qt — 1) = 0 at optimum, so consider two cases. If at > 0, 
then qt = 1. On the other hand, if qt < 1, then at = 0. Thus we may more simply write 


qt = min 


1 


^htjt \ 

V^t I 


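Therefore, our normalization constant $K^* = \sqrt{\lambda}$. To solve for $\lambda$, by complementary slackness, $\lambda\left(\sum_t q_t c_t - B\right) = 0$. If $\lambda = 0$, then the form of $q_t$ and prior discussion implies that all $q_t = 1$, and we have $\sum_t c_t \le B$; in other words, we have enough budget to purchase every point. Otherwise, the budget constraint is tight and $\sum_t q_t c_t = B$, so

$$\sum_t \min\left\{1, \frac{\Delta_{h_t,f_t}}{\sqrt{\lambda}\,\sqrt{c_t}}\right\} c_t = B.$$

Let us call those points that are taken with probability $q_t = 1$ “valuable” and the others “less valuable”, and let $S$ be the set of less valuable points, $S = \{t : q_t < 1\}$. Then we can rewrite this as

$$\sum_{t \notin S} c_t + \sum_{t \in S} \frac{\Delta_{h_t,f_t}\sqrt{c_t}}{\sqrt{\lambda}} = B,$$

so

$$K^* = \sqrt{\lambda} = \frac{\sum_{t \in S} \Delta_{h_t,f_t}\sqrt{c_t}}{B - \sum_{t \notin S} c_t}.$$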


This completes the proof. Let us make several final comments and observations, however. First, if the budget is small relative to the amount of data, then with Lipschitz loss functions, no data points will be taken with probability $q_t = 1$, so $S$ will equal all of $\{1, \dots, T\}$. In this case, the expectation of $K^*$ is exactly $\frac{T}{B}\gamma_{T,A}$, which is the meaning of our informal statement $K^* \approx \frac{T}{B}\gamma_{T,A}$.

Second, this $K^*$ is optimal “pointwise”, in that it includes advance knowledge of which data points will be taken and which hypotheses will be posted. However, notice that, to satisfy the budget constraint, it suffices to take the expectation and choose a normalization constant

$$K = \mathbb{E}\left[\frac{\sum_{t \in S} \Delta_{h_t,f_t}\sqrt{c_t}}{B - \sum_{t \notin S} c_t}\right].$$

Third, as noted above, the extreme case is when all $q_t < 1$, and in this case the above $K = \frac{T}{B}\gamma_{T,A}$ exactly. While this will not be “as optimal” for the specific random outcomes of this sequence, it will suffice to prove good upper bounds on regret. Furthermore, it holds that any choice of $K \ge \frac{T}{B}\gamma_{T,A}$ satisfies the expected budget constraint; and (by setting $\eta$ as a function of $K$) suffices to prove an upper bound on regret. □
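The form of the solution derived in this proof can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes every gradient norm and every cost is strictly positive, and the function name `optimal_sampling_probs` and the use of bisection to find the normalization constant are illustrative choices.

```python
import math

def optimal_sampling_probs(deltas, costs, budget, iters=200):
    """Return (q, K) with q_t = min(1, delta_t / (K * sqrt(c_t))).

    K is found by bisection so that the expected spend sum_t q_t * c_t
    does not exceed the budget (and is tight when the budget binds).
    Assumes every delta_t > 0 and every c_t > 0.
    """
    if sum(costs) <= budget:
        # Enough budget to buy every point: all q_t = 1 (the lambda = 0 case).
        return [1.0] * len(costs), 0.0

    def spend(K):
        return sum(min(1.0, d / (K * math.sqrt(c))) * c
                   for d, c in zip(deltas, costs))

    # At K = hi each term is at most delta*sqrt(c)/K, so spend(hi) <= budget.
    lo = 0.0
    hi = sum(d * math.sqrt(c) for d, c in zip(deltas, costs)) / budget
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if spend(mid) > budget:
            lo = mid   # K too small: we would overspend
        else:
            hi = mid   # K feasible: try a smaller one
    q = [min(1.0, d / (hi * math.sqrt(c))) for d, c in zip(deltas, costs)]
    return q, hi
```

For example, with gradient norms `[1.0, 0.1]`, unit costs, and budget 1.5, the first point is "valuable" (bought with probability 1) and the returned constant agrees with the closed form $\sum_{t \in S}\Delta_{h_t,f_t}\sqrt{c_t} / (B - \sum_{t \notin S} c_t) = 0.1/0.5 = 0.2$.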


Theorem B.1 (Theorem 4.1). There is a mechanism for the “at-cost” problem of data purchasing for regret minimization that interfaces with FTRL and guarantees to meet the expected budget constraint, where for a parameter $\gamma_{T,A} \in [0,1]$ (see its definition):

1. The expected regret is bounded by $O\left(\max\left\{\frac{T}{\sqrt{B}}\gamma_{T,A}, \sqrt{T}\right\}\right)$.

2. This is optimal in that no mechanism can improve beyond constant factors.

3. The pricing strategy is to choose a parameter $K = O\left(\frac{T}{B}\gamma_{T,A}\right)$ and draw $\pi_t(f)$ randomly according to a distribution such that $\Pr[\pi_t(f) \ge c] = \min\left\{1, \frac{\Delta_{h_t,f}}{K\sqrt{c}}\right\}$.

The only prior knowledge required is an estimate of $\gamma_{T,A}$ up to a constant factor.


Proof. The lower bound proof appears in Theorem 6.2.

For the upper bound, we will give a more careful argument first, obtaining a more subtle bound capturing the two extremes in the regret bound as well as the spectrum in between. We will then simplify to get the theorem statement.













First, note as pointed out in the proof of Lemma 6.1 that choosing any $K \ge \frac{T}{B}\gamma_{T,A} \ge \mathbb{E}[K^*]$ satisfies the expected budget constraint, as each probability of purchase $q_t$ only decreases. We now just need to show that if we know $\gamma_{T,A}$ to within a constant factor larger, i.e. set $K = O\left(\frac{T}{B}\gamma_{T,A}\right)$ and $\eta$ appropriately, then we achieve the regret bound.

By Lemma 3.1, for any choices of $q_t$ and the learning parameter $\eta$, the regret bound satisfies

$$\text{Regret} \le \frac{\beta}{\eta} + 2\eta\,\mathbb{E}\sum_t \frac{\Delta_{h_t,f_t}^2}{q_t}, \tag{6}$$

where $\beta$ is a constant. Our strategy is to set

$$q_t = \min\left\{1, \frac{\Delta_{h_t,f_t}}{K\sqrt{c_t}}\right\}.$$


Recall from the proof of Lemma 6.1 that in the optimal solution there were in general “valuable” points for which the probability of purchase was $q_t = 1$ and “less-valuable” points where $q_t < 1$. We had $S = \{t : q_t < 1\}$. Thus the summation term in the regret bound becomes

$$\sum_t \frac{\Delta_{h_t,f_t}^2}{q_t} = \sum_{t \notin S} \Delta_{h_t,f_t}^2 + K \sum_{t \in S} \Delta_{h_t,f_t}\sqrt{c_t}. \tag{7}$$

Before we prove the theorem statement, let us show how to achieve the more subtle bound. So for the sake of this argument, let $\gamma_{T,A}(S) = \frac{1}{|S|}\sum_{t \in S} \Delta_{h_t,f_t}\sqrt{c_t}$ and let $K_S$ approximate the more precise form derived in the proof of Lemma 6.1; that is,

$$K_S = O\left(\frac{|S|}{B - \sum_{t \notin S} c_t}\,\gamma_{T,A}(S)\right).$$


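Then the summation term of the regret bound (Expression (7)) is at most a constant times

$$\sum_{t \notin S} \Delta_{h_t,f_t}^2 + \frac{|S|^2}{B - \sum_{t \notin S} c_t}\,\gamma_{T,A}(S)^2 \;\le\; T - |S| + \frac{|S|^2}{B - \sum_{t \notin S} c_t}\,\gamma_{T,A}(S)^2 \tag{8}$$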


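as each $\Delta_{h_t,f_t} \le 1$. It remains to select the parameter $\eta$ to use for the learning algorithm and plug into the original regret bound, Expression (6). If the algorithm has an accurate estimate of $K_S$, $|S|$, and the other quantities appearing in Expression (8), then it can set $\eta$ equal to the square root of one over Expression (8). (Note this may be achievable by tuning $\eta$ online as well, perhaps even with a theoretical guarantee.) In this case, the regret bound is

$$\text{Regret} \le O\left(\sqrt{T - |S| + \frac{|S|^2}{B - \sum_{t \notin S} c_t}\,\gamma_{T,A}(S)^2}\right).$$

Note that as $B \to 0$, $|S| \to T$, and as $B \to \sum_t c_t$, $|S| \to 0$.

Now let us actually prove the Theorem as stated. Let $\gamma_{T,A} = \frac{1}{T}\,\mathbb{E}\sum_t \Delta_{h_t,f_t}\sqrt{c_t}$ and let $K = \frac{T}{B}\gamma_{T,A}$. The summation term in the regret bound, Expression (7), is upper-bounded in expectation by

$$T + (T\gamma_{T,A}) \cdot K = T + \frac{T^2}{B}\gamma_{T,A}^2,$$

using that $T\gamma_{T,A} \ge \mathbb{E}\sum_{t \in S} \Delta_{h_t,f_t}\sqrt{c_t}$ since it is a summation over more (positive) terms. Now by Expression (6),

$$\text{Regret} \le \frac{\beta}{\eta} + 2\eta\left(T + \frac{T^2}{B}\gamma_{T,A}^2\right).$$

Setting

$$\eta = \Theta\left(1 \Big/ \max\left\{\sqrt{T}, \frac{T}{\sqrt{B}}\gamma_{T,A}\right\}\right)$$

gives a regret bound of the order of $1/\eta$. □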
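The final tuning step can be illustrated with a short numerical sketch. It assumes the constant in $\Theta(\cdot)$ is taken to be 1 and that $\beta = 1$ by default; the function name `tuned_regret_bound` is hypothetical. With this choice, the right-hand side of the regret inequality is at most $(\beta + 4)\max\{\sqrt{T}, T\gamma_{T,A}/\sqrt{B}\}$, because $T + T^2\gamma_{T,A}^2/B$ is at most twice the square of that maximum.

```python
import math

def tuned_regret_bound(T, B, gamma, beta=1.0):
    """Return (eta, bound) for eta = 1 / max(sqrt(T), T*gamma/sqrt(B)).

    bound = beta/eta + 2*eta*(T + T^2 * gamma^2 / B), i.e. the right-hand
    side of the regret inequality when K = (T/B) * gamma.
    """
    scale = max(math.sqrt(T), T * gamma / math.sqrt(B))
    eta = 1.0 / scale
    summation = T + (T ** 2) * (gamma ** 2) / B
    bound = beta / eta + 2.0 * eta * summation
    return eta, bound
```

For instance, `tuned_regret_bound(10000, 100, 0.5)` uses the scale $\max\{100, 500\} = 500$, so the bound is on the order of $T\gamma_{T,A}/\sqrt{B}$ rather than $\sqrt{T}$.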


B.2 At-cost lower bounds 


Theorem B.2 (Theorem 6.1). Suppose all costs $c_t = 1$. No algorithm for the at-cost online data-purchasing problem has regret better than $O(T/\sqrt{B})$; that is, for every algorithm, there exists an input sequence on which its regret is $\Omega(T/\sqrt{B})$.


Proof. Consider two possible input distributions: i.i.d. flips of a coin that has probability $\frac{1}{2} + \epsilon$ of heads, or of one with probability $\frac{1}{2} - \epsilon$ of heads.

It will suffice to prove the following:

Claim 1: If there is an algorithm with budget $B$ and expected regret at most $T\epsilon/6$, then there is an algorithm to distinguish whether a coin is $\epsilon$-heads-biased or $\epsilon$-tails-biased with probability at least 2/3 using $18B$ coin flips.

This claim implies the theorem because it is known that distinguishing these coins requires $\Omega(1/\epsilon^2)$ coin flips. Hence expected regret at most $T\epsilon/6$ forces $18B \ge \Omega(1/\epsilon^2)$; in other words, it implies that $\epsilon \ge \Omega(1/\sqrt{B})$, so the algorithm’s expected regret must be $\Omega(T/\sqrt{B})$.

We prove Claim 1 by proving the following two claims: 

Claim 2: If an algorithm’s expected regret is at most $T\epsilon/6$, then under the $\epsilon$-heads-biased coin, with probability at least 5/6, it outputs the heads hypothesis more times than the tails hypothesis. (And symmetrically under the tails-biased coin.)

Claim 3: An algorithm in this coin setting with budget $B$ can, with probability at least 5/6, be simulated for $T$ rounds using at most $18B$ coin flips, in the sense that its behavior is identical to its behavior on a full sequence of $T$ coin flips.

Proof of Claim 1 from 2 and 3. We will take an algorithm with budget $B$ and regret at most $T\epsilon/6$ and use it to distinguish the coin using $18B$ coin flips: using Claim 3, we can simulate the algorithm’s behavior for all $T$ rounds using at most $18B$ coin flips, except with probability 1/6. Then, if the algorithm used the hypothesis heads more times than tails, we guess that the coin is heads-biased, and symmetrically. By Claim 2, our guess is correct except with probability 1/6. By a union bound, therefore, this procedure correctly distinguishes the coin except with probability 1/3, proving Claim 1.

Proof of Claim 2. Suppose the coin being flipped is the heads-biased coin; everything that follows will hold symmetrically for the tails-biased coin. Now, suppose that the algorithm outputs the hypothesis tails for $M$ of the $T$ rounds. Since each round is an independent coin toss, if the hypothesis is tails then its expected loss on that round is $\frac{1}{2} + \epsilon$; if heads, $\frac{1}{2} - \epsilon$. This gives an expected loss of $M\left(\frac{1}{2} + \epsilon\right) + (T - M)\left(\frac{1}{2} - \epsilon\right) = \frac{T}{2} + (2M - T)\epsilon$.

Meanwhile, the expected loss of the optimal hypothesis is at most $T\left(\frac{1}{2} - \epsilon\right)$, since this is the expected loss of the heads hypothesis. Therefore, the algorithm’s expected regret, if it outputs the hypothesis tails $\mathbb{E}M$ times on average, is at least

$$\frac{T}{2} + (2\,\mathbb{E}M - T)\epsilon - T\left(\frac{1}{2} - \epsilon\right) = 2\,\mathbb{E}M\,\epsilon.$$

If the algorithm’s regret is at most $T\epsilon/6$, then this implies that $2\,\mathbb{E}M\,\epsilon \le T\epsilon/6$, or $\mathbb{E}M \le T/12$. Thus by Markov’s inequality, the probability that half or more of the hypotheses are tails is bounded by

$$\Pr[M \ge T/2] \le \frac{\mathbb{E}M}{T/2} \le 1/6.$$


Proof of Claim 3. Here, we assume that $\epsilon \le 1/6$, or $B$ is larger than a (relatively small) constant.

On each data point, there are four possible menus: whether to buy or not to buy if the point is a heads or is a tails (see the footnote below). If the menu is (don’t buy, don’t buy), then no coin flip is needed (the behavior of the algorithm is identical whether the coin is actually flipped or not). Otherwise, the coin must be flipped, but the algorithm buys the data point with probability at least $\frac{1}{2} - \epsilon \ge \frac{1}{3}$ (the lowest probability of the remaining three menus). Thus the expected number of flips needed before the budget is exhausted is at most $3B$, and by Markov’s inequality, the probability that it exceeds $18B$ is at most 1/6. □
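The arithmetic behind Claims 2 and 3 can be checked with a small sketch. Everything here is illustrative: the function names are hypothetical, exact rational arithmetic is used for the regret identity, and the simulation assumes the worst case allowed by the proof of Claim 3, namely that every flipped coin is bought with probability exactly 1/3 at unit cost.

```python
from fractions import Fraction
import random

def expected_regret_heads_biased(T, expected_tails_rounds, eps):
    """Expected regret under the eps-heads-biased coin when 'tails' is
    output on expected_tails_rounds rounds: the algorithm's expected loss
    minus the loss T*(1/2 - eps) of always playing 'heads'."""
    M = Fraction(expected_tails_rounds)
    half = Fraction(1, 2)
    algo_loss = M * (half + eps) + (T - M) * (half - eps)
    return algo_loss - T * (half - eps)

def flips_until_budget(B, buy_prob, rng):
    """Count coin flips until B unit-cost purchases have been made,
    buying each flipped coin independently with probability buy_prob."""
    flips = purchases = 0
    while purchases < B:
        flips += 1
        if rng.random() < buy_prob:
            purchases += 1
    return flips
```

With $T = 120$, $\mathbb{E}M = 10$ and $\epsilon = 1/10$, the first function returns exactly $2\,\mathbb{E}M\,\epsilon = 2 = T\epsilon/6$, the boundary case of Claim 2. The second function lets one observe that the number of flips concentrates near $3B$, well below the $18B$ threshold used in Claim 3.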


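Theorem B.3 (Theorem 6.2). No algorithm for the non-strategic online data-purchasing problem has expected regret better than $O\left(\gamma_{T,A}T/\sqrt{B}\right)$; that is, for every algorithm, there exists a sequence with parameter $\gamma_{T,A}$ on which its regret is $\Omega\left(\gamma_{T,A}T/\sqrt{B}\right)$.

Similarly, for $\bar{c} = \frac{1}{T}\sum_t \sqrt{c_t}$ and $\mu = \frac{1}{T}\sum_t c_t$ we have the lower bounds $\Omega\left(T\bar{c}/\sqrt{B}\right)$ and $\Omega\left(T\sqrt{\mu}/\sqrt{B}\right)$.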


Footnote (on the menus in the proof of Claim 3): The algorithm may make this a randomized menu, but we can simply consider the outcome of that random menu.

Proof. We reduce to the previous theorem. Consider the following distribution on input sequences. There are three possible data points: heads, tails, and “no coin”. There are still two hypotheses, heads and tails. Both have loss 1 on the “no coin” data point.

Now fix any $\gamma_{T,A} \in [0,1]$. We will first send $(1 - \gamma_{T,A})T$ data points, all of which are “no coin”. The loss of either hypothesis on all of these points is 1, and the cost of these points is zero. Then, we will choose either the $\epsilon$-heads-biased or $\epsilon$-tails-biased coin, with $\epsilon = 1/\sqrt{B}$, and send $T' = \gamma_{T,A}T$ coin flips, just as in the previous proof.

Because the first $(1 - \gamma_{T,A})T$ points are irrelevant to the regret, the regret of any algorithm is simply its regret on these final $T'$ data points, which by the previous proof is at least on the order of $T'\epsilon = T'/\sqrt{B} = \gamma_{T,A}T/\sqrt{B}$.

Now to check that the parameter $\gamma_{T,A}$ chosen above really is the $\gamma_{T,A}$ value of the data sequence, note that the convexified hypothesis space for this problem is the space of distributions $p \in \mathbb{R}^2$ on {heads, tails}, with loss $1 - p \cdot (1, 0)$ if the coin is heads or $1 - p \cdot (0, 1)$ if the coin is tails. The gradient of the loss on either point for all $p$ is (up to sign) $(1, 0)$ or $(0, 1)$ respectively, and both have norm 1. So $\Delta_{h_t,f_t} = 1$ for all “heads” and “tails” data points. Thus we have that $\frac{1}{T}\sum_t \Delta_{h_t,f_t}\sqrt{c_t} = \frac{T'}{T} = \gamma_{T,A}$.

Finally, noting that $\gamma_{T,A} = \bar{c}$ in this case gives the bound containing $\bar{c}$. For the lower bound with $\mu$, take the exact construction in Theorem 6.1 and let each point have $c_t = \mu$ instead of $c_t = 1$. □
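As a sanity check on this construction, the sketch below builds the sequence of (gradient norm, cost) pairs and recomputes the parameter empirically. The helper names are hypothetical, and the gradient norm assigned to “no coin” points is an arbitrary placeholder: it never contributes because those points have cost zero.

```python
import math

def hard_sequence(T, gamma):
    """Build (delta, cost) pairs: 'no coin' points first, then coin flips.

    'No coin' points have cost 0; coin-flip points have cost 1 and
    gradient norm delta = 1. The delta on a 'no coin' point is set to 0
    purely as a placeholder; it is multiplied by sqrt(0) anyway.
    """
    n_coin = round(gamma * T)
    return [(0.0, 0.0)] * (T - n_coin) + [(1.0, 1.0)] * n_coin

def gamma_of(sequence):
    """Empirical (1/T) * sum_t delta_t * sqrt(c_t) for one sequence."""
    return sum(d * math.sqrt(c) for d, c in sequence) / len(sequence)
```

Calling `gamma_of(hard_sequence(1000, 0.25))` recovers the chosen parameter, and the same value is obtained for the average of $\sqrt{c_t}$, which is why the bound can equally be stated in terms of $\bar{c}$ for this sequence.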


C No regret — main setting 


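Theorem C.1 (Theorem 4.2). If the mechanism is run with prior knowledge of $\gamma_{T,A}$ and of $\gamma^{\max}_{T,A}$ (up to a constant factor), then it can choose $K$ and $\eta$ to satisfy the expected budget constraint and obtain a regret bound of

$$O\left(\max\left\{\frac{T}{\sqrt{B}}\,g, \sqrt{T}\right\}\right),$$

where $g = \sqrt{\gamma_{T,A} \cdot \gamma^{\max}_{T,A}}$ (by setting $K = O\left(\frac{T}{B}\gamma^{\max}_{T,A}\right)$). Similarly, knowledge only of $\gamma_{T,A}$, respectively $\bar{c} = \frac{1}{T}\sum_t \sqrt{c_t}$, respectively $\mu = \frac{1}{T}\sum_t c_t$, suffices for the regret bound with $g = \sqrt{\gamma_{T,A}}$, respectively $g = \sqrt{\bar{c}}$, respectively $g = \mu^{1/4}$.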


Proof. The proof will proceed by finding a close-to-optimal value $K$ of the normalizing constant by considering the budget constraint, then plugging this into the regret term to get a bound. The constant maximum price plays into this proof in a slightly non-obvious way. Because of this, instead of setting this maximum price equal to 1, we consider the generalization where costs may lie in $[0, c_{\max}]$.

Consider time $t$ when $(c_t, f_t)$ arrives. Recall that the approach at time $t$ is to draw a price for $f_t$ from the distribution where

$$A_t(c) = \Pr[\text{price} \ge c] = \min\left\{1, \frac{\Delta_{h_t,f_t}}{K\sqrt{c}}\right\}.$$

Consider then the induced posted-price distribution, which is pictured in the figure. It has a point mass at $c_{\max}$ of probability $\frac{\Delta_{h_t,f_t}}{K\sqrt{c_{\max}}}$ (see the footnote on this quantity). Otherwise, it is continuous on the interval $[c^*, c_{\max}]$ with density

$$\frac{\Delta_{h_t,f_t}}{2Kx^{3/2}},$$

and the lower endpoint $c^*$ satisfies $A_t(c^*) = 1$, i.e. $c^* = \Delta_{h_t,f_t}^2 / K^2$.
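One way to draw a price from this distribution is inverse-transform sampling. The sketch below is an illustration under that assumption, not a procedure stated in the text; the function name `price_from_uniform` is hypothetical, and the caller is assumed to supply a uniform draw $u \in (0, 1]$.

```python
def price_from_uniform(u, delta, K, c_max):
    """Map u in (0, 1] to a price with Pr[price >= c] = min(1, delta/(K*sqrt(c))).

    Inverse-transform sampling: for c <= c_max, price >= c exactly when
    u <= delta / (K * sqrt(c)); capping at c_max creates the point mass.
    """
    if not 0.0 < u <= 1.0:
        raise ValueError("u must lie in (0, 1]")
    return min(c_max, (delta / (K * u)) ** 2)
```

A draw could be obtained as `price_from_uniform(1.0 - random.random(), delta, K, c_max)`, since `1.0 - random.random()` lies in $(0, 1]$. Setting $u = 1$ returns the lower endpoint $c^* = \Delta_{h_t,f_t}^2/K^2$, and any $u \le \Delta_{h_t,f_t}/(K\sqrt{c_{\max}})$ returns $c_{\max}$.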

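We first find the bound on $K$ such that the expected budget constraint is satisfied. The expected amount spent on arrival $t$ can be computed as follows.

$$\begin{aligned}
c_{\max}\Pr[\text{price} = c_{\max}] + \int_{\max\{c_t, c^*\}}^{c_{\max}} x\,(\text{pdf at } x)\,dx
&= c_{\max}\frac{\Delta_{h_t,f_t}}{K\sqrt{c_{\max}}} + \int_{\max\{c_t, c^*\}}^{c_{\max}} x\,\frac{\Delta_{h_t,f_t}}{2Kx^{3/2}}\,dx \\
&= \frac{\Delta_{h_t,f_t}}{K}\left(\sqrt{c_{\max}} + \int_{\max\{c_t, c^*\}}^{c_{\max}} \frac{1}{2\sqrt{x}}\,dx\right) \\
&= \frac{\Delta_{h_t,f_t}}{K}\left(2\sqrt{c_{\max}} - \sqrt{\max\{c_t, c^*\}}\right).
\end{aligned}$$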
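The closed form can be compared against a direct numerical integration. This is a rough sketch under stated assumptions: the price is generated from a uniform variable $u$ as $\min\{c_{\max}, (\Delta_{h_t,f_t}/(Ku))^2\}$, which has the required tail probabilities; the point-mass probability $\Delta_{h_t,f_t}/(K\sqrt{c_{\max}})$ is assumed to be at most 1; and the function names are hypothetical.

```python
import math

def expected_spend_closed_form(delta, K, c_t, c_max):
    """(delta/K) * (2*sqrt(c_max) - sqrt(max(c_t, c_star))), c_star = (delta/K)^2."""
    c_star = (delta / K) ** 2
    return (delta / K) * (2.0 * math.sqrt(c_max) - math.sqrt(max(c_t, c_star)))

def expected_spend_numeric(delta, K, c_t, c_max, n=200_000):
    """Midpoint-rule estimate of E[price * 1{price >= c_t}] over u ~ U(0, 1)."""
    total = 0.0
    for i in range(n):
        u = (i + 0.5) / n
        price = min(c_max, (delta / (K * u)) ** 2)
        if price >= c_t:      # the agent accepts only if the price covers the cost
            total += price
    return total / n
```

For $\Delta_{h_t,f_t} = 0.5$, $K = 2$, $c_{\max} = 1$ and $c_t = 0.25$, both functions give approximately 0.375: the point mass contributes 0.25 and the continuous part contributes 0.125.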


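Now let $c_t^*$ be the value of $c^*$ for arrival $t$ (to distinguish its value in different timesteps). By the budget constraint, we need to pick $K$ so that

$$\sum_t \mathbb{E}\left[\text{spend on arrival } (c_t, f_t)\right] \le B,$$

so

$$\mathbb{E}\sum_t \frac{\Delta_{h_t,f_t}}{K}\left(2\sqrt{c_{\max}} - \sqrt{\max\{c_t, c_t^*\}}\right) \le B.$$

Now we make a simplification: if we substitute $c_t$ in for $\max\{c_t, c_t^*\}$, then the left-hand side only increases. Thus, to satisfy the previous inequality, it suffices to choose $K$ to satisfy

$$\mathbb{E}\sum_t \frac{\Delta_{h_t,f_t}}{K}\left(2\sqrt{c_{\max}} - \sqrt{c_t}\right) \le B.$$

Thus, we let

$$K_{\min} = \frac{1}{B}\,\mathbb{E}\sum_t \Delta_{h_t,f_t}\left(2\sqrt{c_{\max}} - \sqrt{c_t}\right).$$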


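Footnote (on the point-mass probability $\Delta_{h_t,f_t}/(K\sqrt{c_{\max}})$): If this quantity is greater than 1, then we post a price of $c_{\max}$ for this datapoint, and what follows will only be a looser upper bound on the amount spent.

Recall our definition of the “difficulty-of-the-input” parameter

$$\gamma_{T,A} = \frac{1}{T}\,\mathbb{E}\sum_t \Delta_{h_t,f_t}\sqrt{c_t},$$

and let

$$\gamma^{\max}_{T,A} = \frac{1}{T}\,\mathbb{E}\sum_t \Delta_{h_t,f_t}\sqrt{c_{\max}}.$$

Then we have

$$K_{\min} = \frac{T}{B}\left(2\gamma^{\max}_{T,A} - \gamma_{T,A}\right).$$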
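The identity between the two expressions for $K_{\min}$ can be checked on a single realized sequence. In this sketch the expectations are dropped and replaced by empirical sums over one run, which is an assumption made only for illustration; the function names are hypothetical.

```python
import math

def k_min(deltas, costs, budget, c_max):
    """(1/B) * sum_t delta_t * (2*sqrt(c_max) - sqrt(c_t)) for one sequence."""
    return sum(d * (2.0 * math.sqrt(c_max) - math.sqrt(c))
               for d, c in zip(deltas, costs)) / budget

def gammas(deltas, costs, c_max):
    """Empirical gamma and gamma_max for one realized sequence."""
    T = len(deltas)
    gamma = sum(d * math.sqrt(c) for d, c in zip(deltas, costs)) / T
    gamma_max = sum(d * math.sqrt(c_max) for d in deltas) / T
    return gamma, gamma_max
```

With gradient norms `[1.0, 0.5, 0.25, 0.5]`, costs `[0.25, 1.0, 0.25, 0.0]`, $B = 2$ and $c_{\max} = 1$, the direct sum gives $3.375/2 = 1.6875$, and the same value results from $(T/B)(2\gamma^{\max}_{T,A} - \gamma_{T,A})$ with $\gamma_{T,A} = 0.28125$ and $\gamma^{\max}_{T,A} = 0.5625$.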


We now have the setup to quickly derive bounds such as the theorem statements. Note that any choice of $K \ge K_{\min}$ satisfies the expected budget constraint.

For the first regret bound, suppose that we know both $\gamma_{T,A}$ and $\gamma^{\max}_{T,A}$ up to a constant factor. Then we can set $K = O(K_{\min})$. By Lemma 3.1, the expected regret is bounded by

$$\text{Regret} \le \frac{\beta}{\eta} + 2\eta\,\mathbb{E}\sum_t \frac{\Delta_{h_t,f_t}^2}{A_t(c_t)},$$

where $\beta$ is a constant and $\eta$ will be chosen later.

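As in the known-costs scenario, let us split into those arrivals that we purchase with probability 1 (this corresponds to $c_t \le c_t^*$) and the others, letting $S = \{t : A_t(c_t) < 1\}$. Then the summation term in the regret bound is bounded by a constant times

$$\sum_{t \notin S} \Delta_{h_t,f_t}^2 + \sum_{t \in S} K\,\Delta_{h_t,f_t}\sqrt{c_t} \;\le\; T + \frac{T^2}{B}\,\gamma_{T,A}\left(2\gamma^{\max}_{T,A} - \gamma_{T,A}\right), \tag{9}$$

where we have used the Lipschitz assumption on the loss function, $\Delta_{h_t,f_t} \le 1$.

As $\gamma^{\max}_{T,A} \ge \gamma_{T,A}$, we do not lose much by taking the upper bound

$$M = T + 2\,\frac{T^2}{B}\,\gamma_{T,A} \cdot \gamma^{\max}_{T,A}. \tag{10}$$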


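Now we can choose $\eta = \Theta(1/\sqrt{M})$ and obtain our regret bound of

$$\text{Regret} \le O\left(\sqrt{M}\right) \le O\left(\max\left\{\sqrt{T}, \frac{T}{\sqrt{B}}\sqrt{\gamma_{T,A} \cdot \gamma^{\max}_{T,A}}\right\}\right).$$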


The other regret bounds will all follow by (1) upper-bounding $\gamma^{\max}_{T,A} \le \sqrt{c_{\max}}$; (2) letting $K = \frac{2T}{B}\sqrt{c_{\max}}$; (3) upper-bounding $\gamma_{T,A}$; (4) setting $\eta$ appropriately. Note that this can only increase $K$, so the expected budget constraint is still satisfied. The modifications simply give a different bound in Expression (10), from which the rest of the argument follows analogously.


From (1) and (2), Expression (10) becomes

$$M = T + 2\,\frac{T^2}{B}\,\gamma_{T,A}\sqrt{c_{\max}}. \tag{11}$$

First, if we know $\gamma_{T,A}$, then picking $\eta = \Theta(1/\sqrt{M})$ gives the corresponding bound.

Second, with only knowledge of $\bar{c} = \frac{1}{T}\sum_t \sqrt{c_t}$, observe that $\gamma_{T,A} \le O(\bar{c})$ and plug in.
Third, observe that by Jensen’s inequality $\bar{c} \le \sqrt{\mu}$ (where $\mu = \frac{1}{T}\sum_t c_t$